THE ANNALS 
of 
MATHEMATICAL 
STATISTICS 


(FOUNDED BY H. C. CARVER) 
THE OFFICIAL JOURNAL OF THE INSTITUTE 
OF MATHEMATICAL STATISTICS 


Contents 
Sequential Tests of Statistical Hypotheses. A. Wald


Non-Parametric Estimation. I. Validation of Order Statistics. 
H. Scheffé and J. W. Tukey


On a Test for Randomness Based on Signs of Differences. Henry 


The Asymptotic Distribution of Runs of Consecutive Elements. 
Irving Kaplansky


On the Approximate Distribution of Ratios. P. L. Hsu 


On the Distribution of the Serial Correlation Coefficient. Herman


A Note Concerning Hotelling’s Method of Inverting a Partitioned 
Matrix. F. V. Waugh


News and Notices 





EDITED BY
S. S. WILKS, Editor

C. C. CRAIG            W. FELLER            J. NEYMAN
ALLEN T. CRAIG         THORNTON C. FRY      WALTER A. SHEWHART
W. EDWARDS DEMING      HAROLD HOTELLING     A. WALD


WITH THE COOPERATION OF 


William G. Cochran     Paul S. Dwyer          William G. Madow
J. H. Curtiss          Churchill Eisenhart    Alexander M. Mood
J. F. Daly             Paul R. Halmos         Henry Scheffé
Harold F. Dodge        Paul G. Hoel           Jacob Wolfowitz


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the
Institute of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given
issue should be reported to the Secretary on or before the 15th of the
month preceding the month of that issue. The months of issue are March,
June, September and December. Because of war-time difficulties of
publication, issues may often be from two to four weeks late in appearing.
Subscribers are therefore requested to wait at least 30 days after month of
issue before making inquiries concerning non-delivery.


Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts
should be typewritten double-spaced with wide margins, and the original copy
should be submitted. Footnotes should be reduced to a minimum and whenever
possible replaced by a bibliography at the end of the paper; formulae in
footnotes should be avoided. Figures, charts, and diagrams should be drawn on
plain white paper or tracing cloth in black India ink twice the size they are to
be printed. Authors are requested to keep in mind typographical difficulties
of complicated mathematical formulae.


Authors will ordinarily receive only galley proofs. Fifty reprints without
covers will be furnished free. Additional reprints and covers furnished at cost.


The subscription price for the ANNALS is $5.00 per year. Single copies $1.50.
Back numbers are available at $5.00 per volume, or $1.50 per single issue.


COMPOSED AND PRINTED AT THE
WAVERLY PRESS, Inc.
Baltimore, Md., U. S. A.


Entered as second-class matter at the Post Office at Baltimore, Maryland, under the Act of March 3, 1879 
















SEQUENTIAL TESTS OF STATISTICAL HYPOTHESES 
By A. Wald


Columbia University 


TABLE OF CONTENTS 


A. Introduction
B. Historical Note

Part I. Sequential test of a simple hypothesis against a single alternative

1. The current test procedure
2. The sequential test procedure: general definitions
   2.1. Notion of a sequential test. 2.2. Efficiency of a sequential test. 2.3.
   Efficiency of the current procedure, viewed as a particular case of a sequential
   test.
3. Sequential probability ratio test
   3.1. Definition of the sequential probability ratio test. 3.2. Fundamental
   relations among the quantities α, β, A and B. 3.3. Determination of the values
   A and B in practice. 3.4. Probability of accepting H_0 (or H_1) when some third
   hypothesis H is true. 3.5. Calculation of δ and η for binomial and normal
   distributions.
4. The number of observations required by the sequential probability ratio test
   4.1. Expected number of observations necessary for reaching a decision. 4.2.
   Calculation of the quantities ξ and ξ' for binomial and normal distributions.
   4.3. Saving in the number of observations as compared with the current test
   procedure. 4.4. The characteristic function, the moments and the distribution of
   the number of observations necessary for reaching a decision. 4.5. Lower limit
   of the probability that the sequential process will terminate with a number of
   trials less than or equal to a given number. 4.6. Truncated sequential analysis.
   4.7. Efficiency of the sequential probability ratio test.

Part II. Sequential test of a simple or composite hypothesis against a set of
alternatives

5. Test of a simple hypothesis against one-sided alternatives
   5.1. General remarks. 5.2. Application to binomial distributions. 5.3.
   Sequential analysis of double dichotomies. 5.4. Application to testing the mean
   of a normal distribution with known standard deviation.
6. Outline of a general theory of sequential tests of hypotheses when no
   restrictions are imposed on the alternative values of the unknown parameters
   6.1. Sequential test of a simple hypothesis with no restrictions on the
   alternative values of the unknown parameters. 6.2. Sequential test of a composite
   hypothesis.
















A. Introduction 


By a sequential test of a statistical hypothesis is meant any statistical test 
procedure which gives a specific rule, at any stage of the experiment (at the 
n-th trial for each integral value of n), for making one of the following three 
decisions: (1) to accept the hypothesis being tested (null hypothesis), (2) to 
reject the null hypothesis, (3) to continue the experiment by making an addi- 
tional observation. Thus, such a test procedure is carried out sequentially. 
On the basis of the first trial, one of the three decisions mentioned above is made. 
If the first or the second decision is made, the process is terminated. If the 
third decision is made, a second trial is performed. Again on the basis of the 
first two trials one of the three decisions is made and if the third decision is 
reached a third trial is performed, etc. This process is continued until either 
the first or the second decision is made. 

An essential feature of the sequential test, as distinguished from the current 
test procedure, is that the number of observations required by the sequential 
test is not predetermined, but is a random variable due to the fact that at any 
stage of the experiment the decision of terminating the process depends on the 
results of the observations previously made. The current test procedure may 
be considered a limiting case of a sequential test in the following sense: For any 
positive integer n less than some fixed positive integer N, the third decision is 
always taken at the n-th trial irrespective of the results of these first n trials. 
At the N-th trial either the first or the second decision is taken. Which decision 
is taken will depend, of course, on the results of the N trials. 

In a sequential test, as well as in the current test procedure, we may commit
two kinds of errors. We may reject the null hypothesis when it is true (error
of the first kind), or we may accept the null hypothesis when some alternative
hypothesis is true (error of the second kind). Suppose that we wish to test the
null hypothesis H_0 against a single alternative hypothesis H_1, and that we want
the test procedure to be such that the probability of making an error of the
first kind (rejecting H_0 when H_0 is true) does not exceed a preassigned value α,
and the probability of making an error of the second kind (accepting H_0 when
H_1 is true) does not exceed a preassigned value β. Using the current test
procedure, i.e., a most powerful test for testing H_0 against H_1 in the sense of the
Neyman-Pearson theory, the minimum number of observations required by the
test can be determined as follows: For any given number N of observations a
most powerful test is considered for which the probability of an error of the first
kind is equal to α. Let β(N) denote the probability of an error of the second
kind for this test procedure. Then the minimum number of observations is
equal to the smallest positive integer N for which β(N) ≤ β.

In this paper a particular test procedure, called the sequential probability 
ratio test, is devised and shown to have certain optimum properties (see section 
4.7). The sequential probability ratio test in general requires an expected num- 
ber of observations considerably smaller than the fixed number of observations 
needed by the current most powerful test which controls the errors of the first 
and second kinds to exactly the same extent (has the same α and β) as the
sequential test. The sequential probability ratio test frequently results in a
saving of about 50% in the number of observations as compared with the cur- 
rent most powerful test. Another surprising feature of the sequential prob- 
ability ratio test is that the test can be carried out without determining any 
probability distributions whatsoever. In the current procedure the test can be 
carried out only if the probability distribution of the statistic on which the test 
is based is known. This is not necessary in the application of the sequential 
probability ratio test, and only simple algebraic operations are needed for carry- 
ing it out. Distribution problems arise in connection with the sequential prob- 
ability ratio test only if we want to make statements about the probability dis- 
tribution of the number of observations required by the test. 

This paper consists of two parts. Part I deals with the theory of sequential 
tests for testing a simple hypothesis against a single alternative. In Part II a 
theory of sequential tests for testing simple or composite hypotheses against 
infinite sets of alternatives is outlined. The extension of the probability ratio 
test to the case of testing a simple hypothesis against a set of one-sided
alternatives is straightforward and does not present any difficulty. Applications to
testing the means of binomial and normal distributions, as well as to testing 
double dichotomies are given. The theory of sequential tests of hypotheses 
with no restrictions on the possible values of the unknown parameters is, how- 
ever, not as simple. There are several unsolved problems in this case and it is 
hoped that the general ideas outlined in Part II will stimulate further research. 

Sections 5.2, 5.3 and 5.4 in Part II deal with the applications of the sequential 
probability ratio test to binomial distributions, double dichotomies and normal 
distributions. These sections are nearly self-contained and can be understood 
without reading the rest of the paper. Thus, readers who are primarily in- 
terested in these special cases of the sequential probability ratio test rather than 
in the general theory, may profitably read only the above mentioned sections. 
For the benefit of readers who lack a sufficient background in the mathematical 
theory of statistics the exposition in sections 5.2, 5.3 and 5.4 is kept on a fairly 
elementary level. 

It should be pointed out that whenever the number of observations on which
the test is based is for some reason determined in advance, for instance, if certain 
data are available from past history and no additional data can be obtained, then 
the current most powerful test procedure is preferable. The superiority of the 
sequential probability ratio test is due to the fact that it requires a smaller ex- 
pected number of observations than the current most powerful test. This 
feature of the sequential probability ratio test is, however, of no value if the num- 
ber of observations is for some reason determined in advance. 


B. Historical Note 


To the best of the author’s knowledge the first idea of a sequential test, i.e., 
a test where the number of observations is not predetermined but is dependent 
on the outcome of the observations, goes back to H. F. Dodge and H. G. Romig
who proposed a double sampling inspection procedure [1]. In this double samp- 
ling scheme the decision whether a second sample should be drawn or not de- 
pends on the outcome of the observations in the first sample. The reason for 
introducing a double sampling method was, of course, the recognition of the fact 
that double sampling results in a reduction of the amount of inspection as com- 
pared with “single” sampling. 

The double sampling method does not fully take advantage of sequential 
analysis, since it does not allow for more than two samples. A multiple sampling 
scheme for the particular case of testing the mean of a binomial distribution was 
proposed and discussed by Walter Bartky [2]. His procedure is closely related 
to the test which results from the application of the sequential probability ratio 
test to testing the mean of a binomial distribution. Bartky clearly recognized 
the fact that multiple sampling results in a considerable reduction of the average 
amount of inspection. 

The idea of chain experiments discussed briefly by Harold Hotelling [3] is also 
somewhat related to our notion of sequential analysis. An interesting example 
of such a chain of experiments is the series of sample censuses of area of jute in 
Bengal carried out under the direction of P. C. Mahalanobis [6]. The succes- 
sive preliminary censuses, steadily increasing in size, were primarily designed to 
obtain some information as to the parameters to be estimated so that an efficient 
design could be set up for the final sampling of the whole immense jute area in 
the province. 

In March 1943, the problem of sequential analysis arose in the Statistical
Research Group, Columbia University,¹ in connection with a specific question
posed by Captain G. L. Schuyler of the Bureau of Ordnance, Navy Department.
It was pointed out by Milton Friedman and W. Allen Wallis that the mere notion
of sequential analysis could slightly improve the efficiency of some current most
powerful tests. This can be seen as follows: Suppose that N is the planned
number of trials and W_N is a most powerful critical region based on N
observations. If it happens that on the basis of the first n trials (n < N) it is already
certain that the completed set of N trials must lead to a rejection of the null
hypothesis, we can terminate the experiment at the n-th trial and thus save some
observations. For instance, if W_N is defined by the inequality
x_1^2 + ... + x_N^2 ≥ c, and if for some n < N we find that
x_1^2 + ... + x_n^2 ≥ c, we can terminate the
process at this stage. Realization of this naturally led Friedman and Wallis to
the conjecture that modifications of current tests may exist which take advantage
of sequential procedure and effect substantial improvements. More specifically,
Friedman and Wallis conjectured that a sequential test may exist that controls
the errors of the first and second kinds to exactly the same extent as the current
most powerful test, and at the same time requires an expected number of
observations substantially smaller than the number of observations required by the
current most powerful test.²

1 The Statistical Research Group operates under a contract with the Office of Scientific
Research and Development and is directed by the Applied Mathematics Panel of the
National Defense Research Committee.
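
The curtailment argument of Friedman and Wallis can be illustrated with a short sketch. It assumes a critical region of the form x_1^2 + ... + x_N^2 ≥ c, for which the partial sums can only grow, so rejection is certain as soon as a partial sum reaches c. The function name and the sample values are hypothetical and serve only as illustration.

```python
from typing import Sequence, Tuple


def curtailed_sum_of_squares_test(observations: Sequence[float], c: float) -> Tuple[bool, int]:
    """Return (reject, n_used) for the critical region x_1^2 + ... + x_N^2 >= c.

    The planned sample size N is len(observations). Because every term is
    non-negative, the test stops at the first n for which the partial sum
    already reaches c, since the full sample must then also fall in the region.
    """
    total = 0.0
    for n, x in enumerate(observations, start=1):
        total += x * x
        if total >= c:
            return True, n  # rejection is already certain at the n-th trial
    return False, len(observations)  # all N trials used, no rejection


# Rejection is certain after the first of four planned observations.
print(curtailed_sum_of_squares_test([3.0, 0.1, 0.2, 0.1], c=5.0))
```

Only the rejection side is curtailed here; the saving is therefore small, which is consistent with the remark that the mere notion of sequential analysis improves current tests only slightly.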

It was at this stage that the problem was called to the attention of the author
of the present paper. Since infinitely many sequential test procedures exist,
the first and basic problem was, of course, to find the particular sequential test
procedure which is most efficient, i.e., which effects the greatest possible saving
in the expected number of observations as compared with any other (sequential
or non-sequential) test. In April, 1943 the author devised such a test, called
the sequential probability ratio test, which for all practical purposes is most
efficient when used for testing a simple hypothesis H_0 against a single
alternative H_1.

Because of the substantial savings in the expected number of observations 
effected by the sequential probability ratio test, and because of the simplicity 
of this test procedure in practical applications, the National Defense Research 
Committee considered these developments sufficiently useful for the war effort 
to make it desirable to keep the results out of the reach of the enemy, at least for 
a certain period of time. The author was, therefore, requested to submit his 
findings in a restricted report [7] which was dated September, 1943.³ In this
report the sequential probability ratio test is devised and its mathematical theory 
is developed. In July 1944 a second report [8] was issued by the Statistical 
Research Group which gives an elementary non-mathematical exposition of 
the applications of the sequential probability ratio test, together with charts, 
tables and computational simplifications to facilitate applications. 

Independently of the developments here, G. A. Barnard [9] recognized the 
merits of a sequential method of testing, i.e., the possibility of a saving in the 
number of observations as compared with the current most powerful test. He 
also devised an interesting sequential test for testing double dichotomies, which 
differs from the one obtained by applying the sequential probability ratio test. 

Some further developments in the theory of the sequential probability ratio 
test took place in 1944. Extending the methods used in [7], C. M. Stockman 
[10] found the operating characteristic curve of the sequential probability ratio 
test applied to a binomial distribution. Independently of Stockman, Milton 
Friedman and George W. Brown (independently of each other) obtained the 
same result which can be extended to the normal distribution and a few other 
specific distributions, but is not applicable to more general distributions. The 
general operating characteristic curve for any sequential probability ratio test 
was derived by the author [11]. A few months later the author developed a
general theory of cumulative sums [4] which gives not only the operating
characteristic curve for any sequential probability ratio test but also the
characteristic function of the number of observations required by the test.

2 Bartky's multiple sampling scheme [2] for testing the mean of a binomial distribution
provides, of course, an example of such a sequential test (see, for example, the remarks on
p. 377 in [2]). Bartky's results were not known to us at that time, since they were published
nearly a year later.

3 The material was recently released, making the present publication possible.

The theory of the sequential probability ratio test as given in the present 
paper differs considerably from the exposition given in [7], since the new de- 
velopments in [4] have been taken into account. However, some tables and a 
few sections of the original report [7] are included in the present paper without 
any substantial changes. 


Part I. SEQUENTIAL TEST OF A SIMPLE HYPOTHESIS AGAINST A
SINGLE ALTERNATIVE


1. The Current Test Procedure 


Let X be a random variable. In what follows in this and the subsequent
sections it will be assumed that the random variable X has either a continuous
probability density function or a discrete distribution. Accordingly, by the
probability distribution f(x) of a random variable X we shall mean either the
probability density function of X or the probability that X = x, depending upon
whether X is a continuous or a discrete variable. Let the hypothesis H_0 to be
tested (null hypothesis) be the statement that the distribution of X is f_0(x).
Suppose that H_0 is to be tested against the single alternative hypothesis H_1 that
the distribution of X is given by f_1(x).

According to the Neyman-Pearson theory of testing hypotheses a most powerful
critical region W_N for testing H_0 against H_1 on the basis of N independent
observations x_1, ..., x_N on X is given by the set of all sample points
(x_1, ..., x_N) for which the inequality

(1.1)    \frac{f_1(x_1) f_1(x_2) \cdots f_1(x_N)}{f_0(x_1) f_0(x_2) \cdots f_0(x_N)} \geq k

is fulfilled. The quantity k on the right hand side of (1.1) is a constant and is
chosen so that the size of the critical region, i.e., the probability of an error of
the first kind, should have the required value α.

For a fixed sample size N the probability β of an error of the second kind is a
single valued function of α, say β_N(α), if a most powerful critical region is used.
Thus, if in addition to fixing the value of α it is required that the probability of
an error of the second kind should have a preassigned value β, or at least it should
not exceed a preassigned value β, we are no longer free to choose the sample size
N. The minimum number of observations required by the test satisfying these
conditions is equal to the smallest integral value of N for which β_N(α) ≤ β.

Thus, the current most powerful test procedure for testing H_0 against H_1 can
be briefly stated as follows: We choose as critical region the region defined by
(1.1) where the constant k is determined so that the probability of an error of
the first kind should have a preassigned value α and N is equal to the smallest
integer for which the probability of an error of the second kind does not exceed
a preassigned value β.
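
As an illustration of how the smallest N is found, the following sketch works out β_N(α) for one concrete case that is assumed here and not taken from this section: X normal with known standard deviation sigma, H_0: mean theta0, H_1: mean theta1 > theta0. In that case the region (1.1) reduces to a lower bound on the sample mean, and β_N(α) = Φ(z_{1-α} - sqrt(N)(theta1 - theta0)/sigma). The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def beta_N(N: int, alpha: float, theta0: float, theta1: float, sigma: float) -> float:
    """Second-kind error of the most powerful size-alpha test with N observations.

    Assumes normal observations with known sigma and theta1 > theta0.
    """
    z = _STD_NORMAL.inv_cdf(1.0 - alpha)            # critical value fixing the size at alpha
    shift = (theta1 - theta0) * N ** 0.5 / sigma    # standardized distance between hypotheses
    return _STD_NORMAL.cdf(z - shift)


def minimum_sample_size(alpha: float, beta: float, theta0: float, theta1: float,
                        sigma: float, n_max: int = 1_000_000) -> int:
    """Smallest N for which beta_N(alpha) does not exceed the preassigned beta."""
    for N in range(1, n_max + 1):
        if beta_N(N, alpha, theta0, theta1, sigma) <= beta:
            return N
    raise ValueError("no N up to n_max meets the requirement")


print(minimum_sample_size(alpha=0.05, beta=0.05, theta0=0.0, theta1=1.0, sigma=1.0))
```

The linear search mirrors the definition directly; since β_N(α) decreases in N for this example, a closed-form bound would serve equally well.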

2. The Sequential Test Procedure: General Definitions


2.1. Notion of a sequential test. In current tests of hypotheses the number of 
observations is treated as a constant for any particular problem. In sequential 
tests the number of observations is no longer a constant, but a random variable. 
In what follows the symbol n is used for the number of observations required by 
a sequential test and the symbol N is used when the number of observations is 
treated as a constant. 

Sequential tests can be described as follows: For each positive integer m the
m-dimensional sample space M_m is subdivided into three mutually exclusive
parts R_m^0, R_m^1 and R_m. After the first observation x_1 has been drawn H_0 is
accepted if x_1 lies in R_1^0, H_0 is rejected (i.e., H_1 is accepted) if x_1 lies in R_1^1, or a
second observation is drawn if x_1 lies in R_1. If the third decision is reached and
a second observation x_2 drawn, H_0 is accepted, H_1 is accepted, or a third
observation is drawn according as the point (x_1, x_2) lies in R_2^0, R_2^1 or in R_2. If (x_1, x_2)
lies in R_2, a third observation x_3 is drawn and one of the three decisions is made
according as (x_1, x_2, x_3) lies in R_3^0, R_3^1 or in R_3, etc. This process is stopped
when, and only when, either the first decision or the second decision is reached.
Let n be the number of observations at which the process is terminated. Then
n is a random variable, since the value of n depends on the outcome of the
observations. (It will be seen later that the probability is one that the sequential
process will be terminated at some finite stage.)
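
The procedure just described can be sketched as a small driver loop. In this sketch the three regions for each m are represented, as an assumption of convenience, by a single function `rule` that maps the sample seen so far to one of three labels; the names `run_sequential_test` and `fixed_sample_rule` are hypothetical. The second function shows how a test with a fixed number N of observations fits the same scheme: it always continues for m < N and decides at m = N.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

ACCEPT_H0 = "accept H0"
ACCEPT_H1 = "accept H1"
CONTINUE = "continue"

# A rule encodes the partition of the m-dimensional sample space for every m.
Rule = Callable[[Sequence[float]], str]


def run_sequential_test(observations: Iterable[float], rule: Rule) -> Tuple[str, int]:
    """Draw observations one at a time until the rule reaches a terminal decision.

    Returns the decision and the number n of observations used. If the supply
    of observations runs out first, CONTINUE is returned with the count so far.
    """
    sample: List[float] = []
    for x in observations:
        sample.append(x)
        decision = rule(sample)
        if decision != CONTINUE:
            return decision, len(sample)
    return CONTINUE, len(sample)


def fixed_sample_rule(N: int, in_critical_region: Callable[[Sequence[float]], bool]) -> Rule:
    """A fixed-sample test written as a sequential rule."""
    def rule(sample: Sequence[float]) -> str:
        if len(sample) < N:
            return CONTINUE  # third decision for every m < N
        return ACCEPT_H1 if in_critical_region(sample) else ACCEPT_H0
    return rule


rule = fixed_sample_rule(3, lambda s: sum(s) >= 2)
print(run_sequential_test([1, 1, 1, 1], rule))
```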

We shall denote by E_0(n) the expected value of n if H_0 is true and by E_1(n)
the expected value of n if H_1 is true. These expected values, of course, depend
on the sequential test used. In order to put this dependence in evidence, we
shall occasionally use the symbols E_0(n | S) and E_1(n | S) to denote the values
E_0(n) and E_1(n), respectively, when the sequential test S is applied.

2.2. Efficiency of a sequential test. As in the current test procedure, errors of
two kinds may be committed in sequential analysis. We may reject H_0 when
it is true (error of the first kind), or we may accept H_0 when H_1 is true (error of
the second kind). With any sequential test there will be associated two
numbers α and β between 0 and 1 such that if H_0 is true the probability is α that we
shall commit an error of the first kind and if H_1 is true, the probability is β that
we shall commit an error of the second kind. We shall say that two sequential
tests S and S' are of equal strength if the values α and β associated with S are
equal to the corresponding values α' and β' associated with S'. If α < α' and
β ≤ β', or if α ≤ α' and β < β', we shall say that S is stronger than S' (S' is
weaker than S). If α > α' and β < β', or if α < α' and β > β', we shall say
that the strength of S is not comparable with that of S'.
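
The ordering by strength is only a partial ordering, which a small sketch makes concrete. The function name and the returned labels are hypothetical; the comparisons follow the definitions given above, with exact equality of the error probabilities assumed to be meaningful for the inputs supplied.

```python
def compare_strength(alpha: float, beta: float, alpha_prime: float, beta_prime: float) -> str:
    """Compare test S with errors (alpha, beta) to test S' with (alpha_prime, beta_prime)."""
    if alpha == alpha_prime and beta == beta_prime:
        return "equal strength"
    if alpha <= alpha_prime and beta <= beta_prime:
        return "S is stronger"    # at least one inequality is strict here
    if alpha >= alpha_prime and beta >= beta_prime:
        return "S is weaker"
    return "not comparable"       # one error is smaller, the other larger


print(compare_strength(0.01, 0.20, 0.05, 0.10))
```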

Restricting ourselves to sequential tests of a given strength, we want to make
the number of observations necessary for reaching a final decision as small as
possible. If S and S' are two sequential tests of equal strength we shall say
that S' is better than S if either E_0(n | S') < E_0(n | S) and E_1(n | S') ≤ E_1(n | S),
or E_0(n | S') ≤ E_0(n | S) and E_1(n | S') < E_1(n | S). A sequential test
will be said to be an admissible test if no better test of equal strength exists.
If a sequential test S satisfies both inequalities E_0(n | S) ≤ E_0(n | S') and
E_1(n | S) ≤ E_1(n | S') for any sequential test S' of strength equal to that of S, then
the test S can be considered to be a best sequential test. That such tests exist,
i.e., that it is possible to minimize E_0(n) and E_1(n) simultaneously, is not proved
here; but it is shown later (section 4.7) that for the so called sequential
probability ratio test defined in section 3.1 both E_0(n) and E_1(n) are very nearly
minimized.⁴ Thus, for all practical purposes the sequential probability ratio
test can be considered best.
Since it is not known whether a sequential test always exists for which both E_0(n)
and E_1(n) are exactly minimized, we need a substitute definition of an optimum
test. Several substitute definitions are possible. We could, for example,
require that the test be admissible and the maximum of the two values E_0(n) and
E_1(n) be minimized, or that the mean [E_0(n) + E_1(n)]/2, or some other weighted
average, be minimized. All these definitions are equivalent if a sequential test
exists for which both E_0(n) and E_1(n) are minimized; but if they cannot be
minimized simultaneously the definitions differ. Which of them is chosen is of no
significance for the purpose of this paper, since for the sequential probability
ratio test proposed later both expected values E_0(n) and E_1(n) are, if not exactly,
very nearly minimized. If we had a priori knowledge as to how frequently H_0
and how frequently H_1 will be true in the long run, it would be most reasonable
to minimize a weighted average (weighted by the frequencies of H_0 and H_1,
respectively) of E_0(n) and E_1(n). However, when such knowledge is absent,
as is usually the case in practical applications, it is perhaps more reasonable to
minimize the maximum of E_0(n) and E_1(n) than to minimize some weighted
average of E_0(n) and E_1(n). Hence the following definition is introduced.

A sequential test S is said to be an optimum test if S is admissible and
Max [E_0(n | S), E_1(n | S)] ≤ Max [E_0(n | S'), E_1(n | S')] for all sequential tests S' of
strength equal to that of S.

By the efficiency of a sequential test S is meant the value of the ratio⁵

\frac{\operatorname{Max}[E_0(n \mid S^*), E_1(n \mid S^*)]}{\operatorname{Max}[E_0(n \mid S), E_1(n \mid S)]}

where S* is an optimum sequential test of strength equal to that of S.

2.3. Efficiency of the current procedure, viewed as a particular case of a sequential
test. The current test procedure can be considered as a particular case of a
sequential test. In fact, let N be the size of the sample used in the current
procedure and let W_N be the critical region on which the test is based. Then the
current procedure can be considered as a sequential test defined as follows: For
all m < N, the regions R_m^0 and R_m^1 are the empty subsets of the m-dimensional sample
space M_m, and R_m = M_m. For m = N, R_N^1 is equal to W_N, R_N^0 is equal to the
complement \bar{W}_N of W_N and R_N is the empty set. Thus, for the current
procedure we have E_0(n) = E_1(n) = N.

4 The author conjectures that E_0(n) and E_1(n) are exactly minimized for the sequential
probability ratio test, but he did not succeed in proving this, except for a special class of
problems (see section 4.7).

5 The existence of an optimum sequential test is not essential for the definition of
efficiency, since Max [E_0(n | S*), E_1(n | S*)] could be replaced by the greatest lower bound of
Max [E_0(n | S'), E_1(n | S')] with respect to all sequential tests S' of strength equal to that
of S.

It will be seen later that the efficiency of the current test based on the most
powerful critical region is rather low. Frequently it is below 1/2. In other words,
an optimum sequential test can attain the same α and β as the current most
powerful test on the basis of an expected number of observations much smaller
than the fixed number of observations needed for the current most powerful test.

In the next section we shall propose a simple sequential test procedure, called 
the sequential probability ratio test, which for all practical purposes can be con- 
sidered an optimum sequential test. It will be seen that these sequential tests 
usually lead to average savings of about 50% in the number of trials as compared 
with the current most powerful test. 


3. Sequential Probability Ratio Test 


3.1. Definition of the sequential probability ratio test. We have seen in section
2.1 that the sequential test procedure is defined by subdividing the m-dimensional
sample space M_m (m = 1, 2, ..., ad inf.) into three mutually exclusive parts
R_m^0, R_m^1 and R_m. The sequential process is terminated at the smallest value n
of m for which the sample point lies either in R_m^0 or in R_m^1. If the sample point
lies in R_n^0 we accept H_0 and if it lies in R_n^1 we accept H_1.

An indication as to the proper choice of the regions R_m^0, R_m^1 and R_m can be
obtained from the following considerations: Suppose that before the sample is
drawn there exists an a priori probability that H_0 is true and the value of this
probability is known. Denote this a priori probability by g_0. Then the a priori
probability that H_1 is true is given by g_1 = 1 - g_0, since it is assumed that the
hypotheses H_0 and H_1 exhaust all possibilities. After a number of observations
have been made we gain additional information which will affect the probability
that H_i (i = 0, 1) is true. Let g_{0m} be the a posteriori probability that H_0 is true
and g_{1m} the a posteriori probability that H_1 is true after m observations have been
made. Then according to the well known formula of Bayes we have


(3.1)    g_{0m} = \frac{g_0 \, p_{0m}(x_1, \ldots, x_m)}{g_0 \, p_{0m}(x_1, \ldots, x_m) + g_1 \, p_{1m}(x_1, \ldots, x_m)}

and

(3.2)    g_{1m} = \frac{g_1 \, p_{1m}(x_1, \ldots, x_m)}{g_0 \, p_{0m}(x_1, \ldots, x_m) + g_1 \, p_{1m}(x_1, \ldots, x_m)}

where p_{im}(x_1, ..., x_m) denotes the probability density in the m-dimensional
sample space calculated under the hypothesis H_i (i = 0, 1).⁶ As an
abbreviation for p_{im}(x_1, ..., x_m) we shall use simply p_{im}.

6 If the probability distribution is discrete p_{im}(x_1, ..., x_m) denotes the probability that
the sample point (x_1, ..., x_m) will be obtained.
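
Formulas (3.1) and (3.2) translate directly into a few lines of code. The sketch below assumes independent observations, so that p_{im} is the product of the single-observation probabilities f_i(x_1) ... f_i(x_m); the function name and the Bernoulli example (success probabilities 0.5 under H_0 and 0.8 under H_1) are hypothetical illustrations.

```python
from math import prod
from typing import Callable, Sequence, Tuple


def posterior_probabilities(sample: Sequence[float],
                            f0: Callable[[float], float],
                            f1: Callable[[float], float],
                            g0: float) -> Tuple[float, float]:
    """Return (g_0m, g_1m) from (3.1) and (3.2) for a prior probability g0 of H_0.

    Assumes independent observations, so p_im is a product of f_i values.
    """
    g1 = 1.0 - g0
    p0m = prod(f0(x) for x in sample)   # probability (density) of the sample under H_0
    p1m = prod(f1(x) for x in sample)   # probability (density) of the sample under H_1
    denominator = g0 * p0m + g1 * p1m
    if denominator == 0.0:
        raise ValueError("sample has probability zero under both hypotheses")
    return g0 * p0m / denominator, g1 * p1m / denominator


def bernoulli(p: float) -> Callable[[float], float]:
    """Probability function of a single 0/1 observation with success probability p."""
    return lambda x: p if x == 1 else 1.0 - p


g0m, g1m = posterior_probabilities([1, 1, 0], bernoulli(0.5), bernoulli(0.8), g0=0.5)
print(g0m, g1m)
```

For long samples the products can underflow; working with sums of logarithms would be the usual remedy, at the cost of obscuring the correspondence with the formulas.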
Let d_0 and d_1 be two positive numbers less than 1 and greater than 1/2. Suppose
that we want to construct a sequential test such that the conditional probability
of a correct decision under the condition that H_0 is accepted is greater than or
equal to d_0, and the conditional probability of a correct decision under the
condition that H_1 is accepted is greater than or equal to d_1.⁷ Then the following
sequential process seems reasonable: At each stage calculate g_{0m} and g_{1m}. If
g_{1m} ≥ d_1, accept H_1. If g_{0m} ≥ d_0, accept H_0. If g_{1m} < d_1 and g_{0m} < d_0, draw
an additional observation. R_m^0 in this sequential process is thus defined by the
inequality g_{0m} ≥ d_0, R_m^1 by the inequality g_{1m} ≥ d_1, and R_m by the simultaneous
inequalities g_{1m} < d_1 and g_{0m} < d_0. It is necessary that the sets R_m^0, R_m^1 and
R_m be mutually exclusive and exhaustive. For this it suffices that the
inequalities

(3.3)    g_{1m} = \frac{g_1 p_{1m}}{g_0 p_{0m} + g_1 p_{1m}} \geq d_1

and

(3.4)    g_{0m} = \frac{g_0 p_{0m}}{g_0 p_{0m} + g_1 p_{1m}} \geq d_0

be not fulfilled simultaneously. To show that (3.3) and (3.4) are incompatible,
we shall assume that they are simultaneously fulfilled and derive a contradiction
from this assumption. The two inequalities sum to

(3.5)    g_{1m} + g_{0m} \geq d_1 + d_0.

Since g_{0m} + g_{1m} = 1, we have

         1 \geq d_1 + d_0

which is impossible, since by assumption d_i > 1/2 (i = 0, 1). Hence it is proved
that the sets R_m^0, R_m^1 and R_m are mutually exclusive and exhaustive.


The inequalities (3.3) and (3.4) are equivalent to the following inequalities, respectively:

(3.6)   $\dfrac{p_{1m}}{p_{0m}} \geq \dfrac{g_0}{g_1} \cdot \dfrac{d_1}{1 - d_1}$

and

(3.7)   $\dfrac{p_{1m}}{p_{0m}} \leq \dfrac{g_0}{g_1} \cdot \dfrac{1 - d_0}{d_0}$.

The constants on the right hand sides of (3.6) and (3.7) do not depend on $m$.
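The equivalence between the a posteriori rule (3.3)-(3.4) and the ratio rule (3.6)-(3.7) can be checked numerically. The following is a minimal Python sketch; the function names and the sample values are illustrative assumptions and do not appear in the paper.

```python
def posteriors(g0, p0m, p1m):
    """A posteriori probabilities (g_0m, g_1m) as in (3.1) and (3.2)."""
    g1 = 1.0 - g0
    total = g0 * p0m + g1 * p1m
    return g0 * p0m / total, g1 * p1m / total


def ratio_thresholds(g0, d0, d1):
    """Right hand sides of (3.6) and (3.7): (upper, lower) limits for p1m/p0m."""
    g1 = 1.0 - g0
    upper = (g0 / g1) * d1 / (1.0 - d1)
    lower = (g0 / g1) * (1.0 - d0) / d0
    return upper, lower


def decide_by_posterior(g0, d0, d1, p0m, p1m):
    """Decision rule stated through (3.3) and (3.4)."""
    g0m, g1m = posteriors(g0, p0m, p1m)
    if g1m >= d1:
        return "accept H1"
    if g0m >= d0:
        return "accept H0"
    return "continue"


def decide_by_ratio(g0, d0, d1, p0m, p1m):
    """The same rule stated through (3.6) and (3.7)."""
    upper, lower = ratio_thresholds(g0, d0, d1)
    ratio = p1m / p0m
    if ratio >= upper:
        return "accept H1"
    if ratio <= lower:
        return "accept H0"
    return "continue"


# Illustrative values (assumed): equal a priori probabilities, d0 = d1 = 0.95.
for p1m in (30.0, 2.0, 0.01):
    print(p1m, decide_by_posterior(0.5, 0.95, 0.95, 1.0, p1m),
          decide_by_ratio(0.5, 0.95, 0.95, 1.0, p1m))
```

Both rules give the same decision away from the exact boundary; at the boundary itself floating-point rounding may break ties differently.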
7 The restrictions $d_0 > 1/2$ and $d_1 > 1/2$ are imposed because otherwise it might happen that the hypothesis with the smaller a posteriori probability will be accepted.

If an a priori probability of $H_0$ does not exist, or if it is unknown, the inequalities (3.6) and (3.7) suggest the use of the following sequential test: At each stage calculate $p_{1m}/p_{0m}$. If $p_{1m} = p_{0m} = 0$, the value of the ratio $p_{1m}/p_{0m}$ is defined to be equal to 1. Accept $H_1$ if

(3.8)   $\dfrac{p_{1m}}{p_{0m}} \geq A$.

Accept $H_0$ if

(3.9)   $\dfrac{p_{1m}}{p_{0m}} \leq B$.

Take an additional observation if

(3.10)   $B < \dfrac{p_{1m}}{p_{0m}} < A$.

Thus, the number $n$ of observations required by the test is the smallest integral value of $m$ for which either (3.8) or (3.9) holds. The constants $A$ and $B$ are chosen so that $0 < B < A$ and the sequential test has the desired value $\alpha$ of the probability of an error of the first kind and the desired value $\beta$ of the probability of an error of the second kind. We shall call the test procedure defined by (3.8), (3.9) and (3.10), a sequential probability ratio test.
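The procedure (3.8)-(3.10) can be sketched in a few lines of Python. The sketch assumes independent observations, so that $\log(p_{1m}/p_{0m})$ is a sum of per-observation terms; the names `sprt` and `bernoulli_log_ratio` and the Bernoulli example are assumptions made for illustration, and the special convention for $p_{1m} = p_{0m} = 0$ is not handled.

```python
import math


def sprt(observations, log_ratio, A, B):
    """Run the test (3.8)-(3.10) on independent observations.

    log_ratio(x) must return log(f1(x) / f0(x)) for a single observation.
    Returns (decision, n), where decision is "H1", "H0", or None if the
    supplied data ran out before a boundary was reached.
    """
    log_a, log_b = math.log(A), math.log(B)
    z_sum = 0.0  # running value of log(p_1m / p_0m)
    n = 0
    for x in observations:
        n += 1
        z_sum += log_ratio(x)
        if z_sum >= log_a:   # (3.8)
            return "H1", n
        if z_sum <= log_b:   # (3.9)
            return "H0", n
    return None, n           # still inside (3.10)


def bernoulli_log_ratio(p0, p1):
    """Per-observation log ratio for 0/1 data with P(X = 1) = p0 or p1."""
    def log_ratio(x):
        if x == 1:
            return math.log(p1 / p0)
        return math.log((1.0 - p1) / (1.0 - p0))
    return log_ratio


# Illustrative run (assumed values): p0 = 0.5, p1 = 0.8, A = 19, B = 1/19.
print(sprt([1, 1, 0, 1, 1, 1, 1, 1, 1], bernoulli_log_ratio(0.5, 0.8), 19.0, 1.0 / 19.0))
```

Working with the logarithm of the ratio avoids underflow of the products $p_{1m}$ and $p_{0m}$ for long samples.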

The sequential test procedure given by (3.8), (3.9) and (3.10) has been justified here merely on an intuitive basis. Section 4.7, however, shows that for this sequential test the expected values $E_0(n)$ and $E_1(n)$ are very nearly minimized.$^{8}$ Thus, for practical purposes this test can be considered an optimum test.

8 It seems likely to the author that $E_0(n)$ and $E_1(n)$ are exactly minimized for the sequential probability ratio test. However, he did not succeed in proving it, except for a special class of problems (see Section 4.7).

3.2. Fundamental relations among the quantities $\alpha$, $\beta$, $A$ and $B$. In this section the quantities $\alpha$, $\beta$, $A$ and $B$ will be related by certain inequalities which are of basic importance for the sequential analysis.

Let $\{x_m\}$ ($m = 1, 2, \ldots$, ad inf.) be an infinite sequence of observations. The set of all possible infinite sequences $\{x_m\}$ is called the infinite dimensional sample space. It will be denoted by $M_\infty$. Any particular infinite sequence $\{x_m\}$ is called a point of $M_\infty$. For any set of $n$ given real numbers $a_1, \ldots, a_n$ we shall denote by $C(a_1, \ldots, a_n)$ the subset of $M_\infty$ which consists of all points (infinite sequences) $\{x_m\}$ ($m = 1, 2, \ldots$, ad inf.) for which $x_1 = a_1, \ldots, x_n = a_n$. For any values $a_1, \ldots, a_n$ the set $C(a_1, \ldots, a_n)$ will be called a cylindric point of order $n$. A subset $S$ of $M_\infty$ will be called a cylindric point, if there exists a positive integer $n$ for which $S$ is a cylindric point of order $n$. Thus, a cylindric point may be a cylindric point of order 1, or of order 2, etc. A cylindric point $C(a_1, \ldots, a_n)$ will be said to be of type 1 if

$\dfrac{p_{1n}}{p_{0n}} = \dfrac{f_1(a_1) f_1(a_2) \cdots f_1(a_n)}{f_0(a_1) f_0(a_2) \cdots f_0(a_n)} \geq A$

and

$B < \dfrac{p_{1m}}{p_{0m}} = \dfrac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A$   ($m = 1, \ldots, n - 1$).

A cylindric point $C(a_1, \ldots, a_n)$ will be said to be of type 0 if

$\dfrac{p_{1n}}{p_{0n}} = \dfrac{f_1(a_1) \cdots f_1(a_n)}{f_0(a_1) \cdots f_0(a_n)} \leq B$

and

$B < \dfrac{p_{1m}}{p_{0m}} = \dfrac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A$   ($m = 1, \ldots, n - 1$).

Thus, if a sample $(x_1, \ldots, x_n)$ is observed for which $C(x_1, \ldots, x_n)$ is a cylindric point of type $i$, the sequential test defined by (3.8), (3.9) and (3.10) leads to the acceptance of $H_i$ ($i = 0, 1$).

Let $Q_i$ be the sum of all cylindric points of type $i$ ($i = 0, 1$). For any subset $M$ of $M_\infty$ we shall denote by $P_i(M)$ the probability of $M$ calculated under the assumption that $H_i$ is true ($i = 0, 1$). Now we shall prove that

(3.11)   $P_i(Q_0 + Q_1) = 1$   ($i = 0, 1$).

This equation means that the probability is equal to one that the sequential process will eventually terminate. To prove (3.11) we shall denote the variate $\log \dfrac{f_1(x_i)}{f_0(x_i)}$ by $z_i$ and $z_1 + \cdots + z_m$ by $Z_m$ ($i, m = 1, 2, \ldots$, ad inf.). Furthermore, denote by $n$ the smallest integer for which either $Z_n \geq \log A$ or $Z_n \leq \log B$. If no such finite integer $n$ exists we shall say that $n = \infty$. Clearly, $n$ is the number of observations required by the sequential test and (3.11) is proved if we show that the probability that $n = \infty$ is zero. But the latter statement was proved by the author elsewhere (see Lemma 1 in [4]). Hence equation (3.11) is proved.

With the help of (3.11) we shall be able to derive some important inequalities satisfied by the quantities $\alpha$, $\beta$, $A$ and $B$. Since for each sample $(x_1, \ldots, x_n)$ for which $C(x_1, \ldots, x_n)$ is an element of $Q_1$ the inequality $p_{1n}/p_{0n} \geq A$ holds, we see that

(3.12)   $P_1(Q_1) \geq A\, P_0(Q_1)$.

Similarly, for each sample $(x_1, \ldots, x_n)$ for which $C(x_1, \ldots, x_n)$ is a point of $Q_0$ the inequality $p_{1n}/p_{0n} \leq B$ holds. Hence

(3.13)   $P_1(Q_0) \leq B\, P_0(Q_0)$.

But $P_0(Q_1)$ is the probability of committing an error of the first kind and $P_1(Q_0)$ is the probability of making an error of the second kind. Thus, we have

(3.14)   $P_0(Q_1) = \alpha$;   $P_1(Q_0) = \beta$.

Since $Q_0$ and $Q_1$ are disjoint, it follows from (3.11) that

(3.15)   $P_0(Q_0) = 1 - \alpha$;   $P_1(Q_1) = 1 - \beta$.

From the relations (3.12)-(3.15) we obtain the important inequalities

(3.16)   $1 - \beta \geq A\alpha$

and

(3.17)   $\beta \leq B(1 - \alpha)$.

These inequalities can be written as

(3.18)   $\dfrac{\alpha}{1 - \beta} \leq \dfrac{1}{A}$

and

(3.19)   $\dfrac{\beta}{1 - \alpha} \leq B$.

The above inequalities are of great value in practical applications, since they supply upper limits for $\alpha$ and $\beta$ when $A$ and $B$ are given. For instance, it follows immediately from (3.18) and (3.19), and the fact that $0 < \alpha < 1$, $0 < \beta < 1$ that

(3.20)   $\alpha \leq \dfrac{1}{A}$

and

(3.21)   $\beta \leq B$.


A pair of values $\alpha$ and $\beta$ can be represented by a point in the plane with the coordinates $\alpha$ and $\beta$. It is of interest to determine the set of all points $(\alpha, \beta)$ which satisfy the inequalities (3.18) and (3.19) for given values of $A$ and $B$. Consider the straight lines $L_1$ and $L_2$ in the plane given by the equations

(3.22)   $A\alpha = 1 - \beta$

and

(3.23)   $\beta = B(1 - \alpha)$,

respectively. The line $L_1$ intersects the abscissa axis at $\alpha = \dfrac{1}{A}$ and the ordinate axis at $\beta = 1$. The line $L_2$ intersects the abscissa axis at $\alpha = 1$ and the ordinate axis at $\beta = B$. The set of all points $(\alpha, \beta)$ which satisfy the inequalities (3.18) and (3.19) is the interior and the boundary of the quadrilateral determined by the lines $L_1$, $L_2$ and the coordinate axes. This set is represented by the shaded area in figure 1.
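The region described above can be explored numerically. The sketch below (Python; all function names are illustrative assumptions) tests whether a point $(\alpha, \beta)$ satisfies (3.16) and (3.17), which are equivalent to (3.18) and (3.19), and computes the vertex of the quadrilateral at which $L_1$ and $L_2$ intersect, obtained by solving (3.22) and (3.23) simultaneously.

```python
def in_admissible_region(alpha, beta, A, B):
    """True if 0 < alpha, beta < 1 and both (3.16) and (3.17) hold."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        return False
    return A * alpha <= 1.0 - beta and beta <= B * (1.0 - alpha)


def upper_limits(A, B):
    """Upper limits (3.20) and (3.21) for alpha and beta."""
    return 1.0 / A, B


def vertex(A, B):
    """Intersection of L1 (A*alpha = 1 - beta) and L2 (beta = B*(1 - alpha))."""
    alpha = (1.0 - B) / (A - B)
    beta = B * (A - 1.0) / (A - B)
    return alpha, beta


# Illustrative values (assumed): A = 19, B = 1/19.
print(upper_limits(19.0, 1.0 / 19.0))
print(vertex(19.0, 1.0 / 19.0))
print(in_admissible_region(0.04, 0.04, 19.0, 1.0 / 19.0))
```

For $A = 19$ and $B = 1/19$ the vertex lies at approximately $(0.05, 0.05)$, the pair of error probabilities for which these boundaries would be chosen in Section 3.3.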

The fundamental inequalities (3.18) and (3.19) were derived under the assumption that $x_1, x_2, \ldots$, ad inf. are independent observations on the same random variable $X$. The independence of the observations is, however, not necessary for the validity of (3.18) and (3.19). In fact, the independence of the observations was used merely to show the validity of (3.11). But (3.11) can be shown to hold also for dependent observations under very general conditions. Hence, if $H_i$ states that the joint distribution of $x_1, x_2, \ldots, x_m$ is given by the joint probability density function $p_{im}(x_1, \ldots, x_m)$$^{9}$ ($i = 0, 1$; $m = 1, 2, \ldots$, ad inf.) and if (3.11) holds, then for the sequential test of $H_0$ against $H_1$, as defined by (3.8), (3.9) and (3.10), the inequalities (3.18) and (3.19) remain valid. For instance, let $\lambda_0$ and $\lambda_1$ be two different positive values $< 1$ and let $H_i$ ($i = 0, 1$) be the hypothesis that the joint probability density function of $x_1, \ldots, x_m$ is given by

$p_{im}(x_1, \ldots, x_m) = \dfrac{1}{(2\pi)^{m/2}} \exp\left\{ -\dfrac{1}{2} x_1^2 - \dfrac{1}{2} \sum_{j=2}^{m} (x_j - \lambda_i x_{j-1})^2 \right\}$   ($i = 0, 1$)

i.e., that $x_1$ and $(x_j - \lambda_i x_{j-1})$ ($j = 2, 3, \ldots$, ad inf.) are normally and independently distributed with zero means and unit variances, then the inequalities (3.18) and (3.19) will hold for the sequential test defined by (3.8), (3.9) and (3.10).

9 Of course, for any positive integers $m$ and $m'$ with $m < m'$ the marginal distribution of $x_1, \ldots, x_m$ determined on the basis of the joint distribution $p_{im'}(x_1, \ldots, x_{m'})$ must be equal to $p_{im}(x_1, \ldots, x_m)$.

3.3. Determination of the values $A$ and $B$ in practice. Suppose that we wish to have a sequential test such that the probability of an error of the first kind is equal to $\alpha$ and the probability of an error of the second kind is equal to $\beta$. Denote by $A(\alpha, \beta)$ and $B(\alpha, \beta)$ the values of $A$ and $B$ for which the probabilities of the errors of the first and second kinds will take the desired values $\alpha$ and $\beta$. The exact determination of the values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ is rather laborious, as will be seen in Section 3.4. The inequalities at our disposal, however, permit the problem to be solved satisfactorily for practical purposes. From (3.18) and (3.19) it follows that

(3.24)   $A(\alpha, \beta) \leq \dfrac{1 - \beta}{\alpha}$

and

(3.25)   $B(\alpha, \beta) \geq \dfrac{\beta}{1 - \alpha}$.

Suppose we put $A = \dfrac{1 - \beta}{\alpha} = a(\alpha, \beta)$ (say), and $B = \dfrac{\beta}{1 - \alpha} = b(\alpha, \beta)$ (say). Then $A$ is greater than or equal to the exact value $A(\alpha, \beta)$, and $B$ is less than or equal to the exact value $B(\alpha, \beta)$. This procedure, of course, changes the probabilities of errors of the first and second kind. If we were to use the exact value of $B$ and a value of $A$ which is greater than the exact value, then evidently we would lower the value of $\alpha$, but slightly increase the value of $\beta$. Similarly, if we were to use the exact value of $A$ and a value of $B$ which is below the exact value, then we would lower the value of $\beta$, but slightly increase the value of $\alpha$. Thus, it is not clear what will be the resulting effect on $\alpha$ and $\beta$ if a value of $A$ is used which is higher than the exact value, and a value of $B$ is used which is lower than the exact value. Denote by $\alpha'$ and $\beta'$ the resulting probabilities of errors of the first and second kind, respectively, if we put $A = \dfrac{1 - \beta}{\alpha}$ and $B = \dfrac{\beta}{1 - \alpha}$.

We now derive inequalities satisfied by the quantities $\alpha'$, $\beta'$, $\alpha$ and $\beta$. Substituting $a(\alpha, \beta)$ for $A$, $b(\alpha, \beta)$ for $B$, $\alpha'$ for $\alpha$ and $\beta'$ for $\beta$ we obtain from (3.18) and (3.19)

(3.26)   $\dfrac{\alpha'}{1 - \beta'} \leq \dfrac{1}{a(\alpha, \beta)} = \dfrac{\alpha}{1 - \beta}$

and

(3.27)   $\dfrac{\beta'}{1 - \alpha'} \leq b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$.

From these inequalities it follows that

(3.28)   $\alpha' \leq \dfrac{\alpha}{1 - \beta}$

and

(3.29)   $\beta' \leq \dfrac{\beta}{1 - \alpha}$.

Multiplying (3.26) by $(1 - \beta)(1 - \beta')$ and (3.27) by $(1 - \alpha)(1 - \alpha')$ and adding the two resulting inequalities, we have

(3.30)   $\alpha' + \beta' \leq \alpha + \beta$.

Thus, we see that at least one of the inequalities $\alpha' \leq \alpha$ and $\beta' \leq \beta$ must hold. In other words, by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, at most one of the probabilities $\alpha$ and $\beta$ may be increased.

If $\alpha$ and $\beta$ are small (say less than .05), as they frequently will be in practical applications, $\dfrac{\alpha}{1 - \beta}$ and $\dfrac{\beta}{1 - \alpha}$ are nearly equal to $\alpha$ and $\beta$, respectively. Thus, we see from (3.28) and (3.29) that the quantity by which $\alpha'$ can possibly exceed $\alpha$, or $\beta'$ can exceed $\beta$, must be small. Section 3.4 contains further inequalities which show that the amount by which $\alpha'$ ($\beta'$) can possibly exceed $\alpha$ ($\beta$) is indeed extremely small. Thus, for all practical purposes $\alpha' \leq \alpha$ and $\beta' \leq \beta$.
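As a small numerical illustration of (3.24)-(3.30), the following Python sketch computes $a(\alpha, \beta)$, $b(\alpha, \beta)$ and the upper limits for $\alpha'$, $\beta'$ and $\alpha' + \beta'$. The function names are assumptions made here; the check $\alpha + \beta < 1$ anticipates the condition the paper imposes in Section 3.4 so that $b < a$.

```python
def wald_boundaries(alpha, beta):
    """Return (a, b) with a = (1 - beta)/alpha and b = beta/(1 - alpha)."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0 and alpha + beta < 1.0):
        raise ValueError("need 0 < alpha, beta < 1 and alpha + beta < 1")
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)


def realized_error_limits(alpha, beta):
    """Upper limits from (3.28), (3.29) and (3.30).

    Returns (limit for alpha', limit for beta', limit for alpha' + beta').
    """
    return alpha / (1.0 - beta), beta / (1.0 - alpha), alpha + beta


# Illustrative values (assumed): alpha = 0.05, beta = 0.10.
a, b = wald_boundaries(0.05, 0.10)
print(a, b)
print(realized_error_limits(0.05, 0.10))
```

With these values $\alpha'$ cannot exceed about $0.0556$ and $\beta'$ cannot exceed about $0.1053$, while their sum cannot exceed $0.15$.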

If $f_1(x)$ (the distribution under the alternative hypothesis) is sufficiently near $f_0(x)$ (the distribution under the null hypothesis), $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be nearly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively; and consequently $\alpha'$ and $\beta'$ are also very nearly equal to $\alpha$ and $\beta$ respectively. The reason that (3.18) and (3.19) and therefore also (3.24) and (3.25) are inequalities instead of equalities is that the sequential process may terminate with $\dfrac{p_{1n}}{p_{0n}} > A$ or $\dfrac{p_{1n}}{p_{0n}} < B$. If at the final stage $\dfrac{p_{1n}}{p_{0n}}$ were exactly equal to $A$ or $B$, then $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. If $f_1(x)$ is near $f_0(x)$, it is almost certain that the value of $\dfrac{p_{1n}}{p_{0n}}$ is changed only slightly by one additional observation. Thus, at the final stage $\dfrac{p_{1n}}{p_{0n}}$ will be only slightly above $A$, or slightly below $B$ and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be nearly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. If fractional observations were possible, that is to say, if the number of observations were a continuous variable, $\dfrac{p_{1m}}{p_{0m}}$ would also be a continuous function of $m$ and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. Thus, we have inequalities in (3.24) and (3.25) instead of equalities merely on account of the fact that the number $m$ of observations is discontinuous, i.e., $m$ can take only integral values.

Hence for all practical purposes the following procedure can be adopted: To construct a sequential test such that the probability of an error of the first kind does not exceed $\alpha$ and the probability of an error of the second kind does not exceed $\beta$, put $A = \dfrac{1 - \beta}{\alpha}$ and $B = \dfrac{\beta}{1 - \alpha}$ and carry out the sequential test as defined by the inequalities (3.8), (3.9) and (3.10).
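The effect of this procedure on the realized error probabilities can be examined by simulation. The sketch below is a Monte Carlo illustration in Python under assumptions not taken from the paper: unit-variance normal observations, hypothesized means `theta0` and `theta1`, and a fixed random seed. It estimates $\alpha'$ and $\beta'$ when $A$ and $B$ are set to $(1-\beta)/\alpha$ and $\beta/(1-\alpha)$.

```python
import math
import random


def simulate_sprt_normal(theta, theta0, theta1, alpha, beta, rng):
    """One run of the test on N(theta, 1) data; returns (accepted_H1, n)."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    z_sum, n = 0.0, 0
    while log_b < z_sum < log_a:           # continuation region (3.10)
        x = rng.gauss(theta, 1.0)
        # log f1(x)/f0(x) for unit-variance normal densities
        z_sum += (theta1 - theta0) * x - 0.5 * (theta1 ** 2 - theta0 ** 2)
        n += 1
    return z_sum >= log_a, n


def estimate_errors(theta0, theta1, alpha, beta, runs=2000, seed=0):
    """Monte Carlo estimates of the realized error probabilities."""
    rng = random.Random(seed)
    type1 = sum(simulate_sprt_normal(theta0, theta0, theta1, alpha, beta, rng)[0]
                for _ in range(runs)) / runs
    type2 = sum(not simulate_sprt_normal(theta1, theta0, theta1, alpha, beta, rng)[0]
                for _ in range(runs)) / runs
    return type1, type2


# Illustrative setting (assumed): means 0 and 1, alpha = beta = 0.05.
print(estimate_errors(0.0, 1.0, 0.05, 0.05))
```

The estimates are subject to sampling error, so they illustrate rather than prove the inequalities (3.28)-(3.30).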
In most practical cases the calculation of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be of little interest for the following reasons: When $A = a(\alpha, \beta) = \dfrac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$, the probability $\alpha'$ of an error of the first kind cannot exceed $\alpha$ and the probability $\beta'$ of an error of the second kind cannot exceed $\beta$, except by a very small quantity which can be neglected for practical purposes. Thus, for all practical purposes the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will not decrease the strength of the sequential test. The only possible disadvantage from the substitution is that it may increase the expected number of trials necessary for a decision. Since the discrepancy between $A(\alpha, \beta)$ and $B(\alpha, \beta)$ on the one hand and $a(\alpha, \beta)$ and $b(\alpha, \beta)$ on the other, arises only from the discontinuity of the number $m$ of observations, it is clear that the increase in the expected number of trials caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ will be slight. This slight increase, however, cannot be considered entirely a loss for the following reason: if $a(\alpha, \beta) > A(\alpha, \beta)$ or $b(\alpha, \beta) < B(\alpha, \beta)$, then we can sharpen the inequality (3.30) to $\alpha' + \beta' < \alpha + \beta$. Hence by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ we gain in strength.

The fact that for practical purposes we may put $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ brings out a surprising feature of the sequential test as compared with current tests. While current tests cannot be carried out without finding the probability distribution of the statistic on which the test is based, there are no distribution problems in connection with sequential tests. In fact, $a(\alpha, \beta)$ and $b(\alpha, \beta)$ depend on $\alpha$ and $\beta$ only, and the ratio $\dfrac{p_{1m}}{p_{0m}}$ can be calculated from the data of the problem without solving any distribution problems. Distribution problems arise in connection with the sequential process only if it is desired to find the probability distribution of the number of trials necessary for reaching a final decision. (This subject is discussed later.) But this is of secondary importance as long as we know that the sequential test on the average leads to a saving in the number of trials.

3.4. Probability of accepting $H_0$ (or $H_1$) when some third hypothesis $H$ is true. In Section 3.2 we were concerned with the probability that the sequential probability ratio test will lead to the acceptance of $H_0$ (or $H_1$) when $H_0$ or $H_1$ is true. Since in Part II we shall admit an infinite set of alternatives, and since this is the practically important case, it is of interest to study the probability of accepting $H_0$ (or $H_1$) when any third hypothesis $H$, not necessarily equal to $H_0$ or $H_1$, is true. Let $H$ be the hypothesis that the distribution of $X$ is given by $f(x)$. If $f(x)$ is equal to $f_0(x)$ or $f_1(x)$ we have the special case discussed in Section 3.2. In what follows in this and the subsequent sections any probability relationship will be stated on the assumption that $H$ is true, unless a statement to the contrary is explicitly made. Denote by $\gamma$ the probability that the sequential probability ratio test will lead to the acceptance of $H_1$.$^{10}$ Clearly, if $H = H_0$, then $\gamma = \alpha$ and if $H = H_1$, then $\gamma = 1 - \beta$.

The probability $\gamma$ can readily be derived on the basis of the general theory of cumulative sums given in [4]. Denote $\log \dfrac{f_1(x_i)}{f_0(x_i)}$ by $z_i$. Then $\{z_i\}$ ($i = 1, 2, \ldots$, ad inf.) is a sequence of independent random variables each having the same distribution. Denote by $Z_j$ the sum of the first $j$ elements of the sequence $\{z_i\}$, i.e.,

(3.31)   $Z_j = z_1 + z_2 + \cdots + z_j$   ($j = 1, 2, \ldots$, ad inf.).

For any relation $R$ we shall denote by $P(R)$ the probability that $R$ holds. For any random variable $Y$ the symbol $EY$ will denote the expected value of $Y$. Let $n$ be the smallest positive integer for which either $Z_n \geq \log A$ or $Z_n \leq \log B$ holds. If $\log B < Z_m < \log A$ holds for $m = 1, 2, \ldots$, ad inf., we shall say that $n = \infty$. Obviously, $n$ is the number of observations required by the sequential probability ratio test. As we have seen in Section 3.3, in practice we shall put $A = a(\alpha, \beta) = \dfrac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$. Since $B$ must be less than $A$, we shall consider only values $\alpha$ and $\beta$ for which $\dfrac{\beta}{1 - \alpha} < \dfrac{1 - \beta}{\alpha}$. This inequality is equivalent to $\alpha + \beta < 1$, which in turn implies that $B < 1$ and $A > 1$. Thus, in all that follows it will be assumed that $A > 1$ and $B < 1$. We shall also assume that the variance of $z_i$ is not zero.

According to Lemma 1 in [4] the relation $P(n = \infty) = 0$ holds. Hence, the probability is equal to one that the sequential process will eventually terminate. This implies that the probability of accepting $H_0$ is equal to $1 - \gamma$.

Let $z$ be a random variable whose distribution is equal to the common distribution of the variates $z_i$ ($i = 1, 2, \ldots$, ad inf.). Denote by $\varphi(t)$ the moment generating function of $z$, i.e.,

$\varphi(t) = E e^{zt}$.

It was shown in [4] that under very mild restrictions on the distribution of $z$ there exists exactly one real value $h$ such that $h \neq 0$ and $\varphi(h) = 1$. Furthermore, it was shown in [4] (see equation (16) in [4]) that

(3.32)   $E e^{Z_n h} = 1$.

Let $E^*$ be the conditional expected value of $e^{Z_n h}$ under the restriction that $H_0$ is accepted, i.e., that $Z_n \leq \log B$, and let $E^{**}$ be the conditional expected value of $e^{Z_n h}$ under the restriction that $H_1$ is accepted, i.e., that $Z_n \geq \log A$. Then we obtain from (3.32)

(3.33)   $(1 - \gamma) E^* + \gamma E^{**} = 1$.

Solving for $\gamma$ we obtain

(3.34)   $\gamma = \dfrac{1 - E^*}{E^{**} - E^*}$.

If both the absolute value of $Ez$ and the variance of $z$ are small, which will be the case when $f_1(x)$ is near $f_0(x)$, $E^*$ and $E^{**}$ will be nearly equal to $B^h$ and $A^h$, respectively. Hence, in this case a good approximation to $\gamma$ is given by the expression

(3.35)   $\bar{\gamma} = \dfrac{1 - B^h}{A^h - B^h}$.

It is easy to verify that $h = 1$ if $H = H_0$, and $h = -1$ if $H = H_1$. The difference $\bar{\gamma} - \gamma$ approaches zero if both the mean and the variance of $z$ converge to zero.

10 The probability that $H_0$ will be accepted is equal to $1 - \gamma$, as will be seen later.
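The approximation (3.35) is easy to evaluate. The short Python sketch below (the function name is an assumption made for illustration) computes it and confirms that with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$ it returns $\alpha$ for $h = 1$ and $1 - \beta$ for $h = -1$, in agreement with the two special cases $H = H_0$ and $H = H_1$.

```python
def approx_prob_accept_h1(h, A, B):
    """Approximation (3.35): (1 - B**h) / (A**h - B**h), for h != 0."""
    if h == 0:
        raise ValueError("h must be different from zero")
    return (1.0 - B ** h) / (A ** h - B ** h)


# Illustrative values (assumed): alpha = 0.05, beta = 0.10.
alpha, beta = 0.05, 0.10
A = (1.0 - beta) / alpha
B = beta / (1.0 - alpha)
print(approx_prob_accept_h1(1.0, A, B))    # close to alpha
print(approx_prob_accept_h1(-1.0, A, B))   # close to 1 - beta
print(approx_prob_accept_h1(0.3, A, B))    # an intermediate hypothesis H
```

Evaluating the function over a range of $h$ traces out the approximate probability of accepting $H_1$ as the true hypothesis $H$ varies.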

To judge the goodness of the approximation given by $\bar{\gamma}$, it is desirable to derive lower and upper limits for $\gamma$. Such limits for $\gamma$ can be obtained by deriving lower and upper limits for $E^*$ and $E^{**}$. First we consider the case when $h > 0$. Let $\zeta$ be a real variable restricted to values $> 1$, and let $\rho$ be a positive variable restricted to values $< 1$. For any random variable $Y$ and any relationship $R$ we shall denote by $E(Y \mid R)$ the conditional expected value of $Y$ under the restriction that $R$ holds. It was shown in [4] that the following inequalities hold:$^{11}$

(3.36)   $B^h \left\{ \underset{\zeta}{\text{g.l.b.}}\; \zeta E\left(e^{hz} \,\middle|\, e^{hz} \leq \dfrac{1}{\zeta}\right) \right\} \leq E^* \leq B^h$   ($h > 0$)

and

(3.37)   $A^h \leq E^{**} \leq A^h \left\{ \underset{\rho}{\text{l.u.b.}}\; \rho E\left(e^{hz} \,\middle|\, e^{hz} \geq \dfrac{1}{\rho}\right) \right\}$   ($h > 0$).

The symbol $\underset{\zeta}{\text{g.l.b.}}$ stands for the greatest lower bound with respect to $\zeta$, and the symbol $\underset{\rho}{\text{l.u.b.}}$ stands for least upper bound with respect to $\rho$. Putting

(3.38)   $\underset{\zeta}{\text{g.l.b.}}\; \zeta E\left(e^{hz} \,\middle|\, e^{hz} \leq \dfrac{1}{\zeta}\right) = \eta$

and

(3.39)   $\underset{\rho}{\text{l.u.b.}}\; \rho E\left(e^{hz} \,\middle|\, e^{hz} \geq \dfrac{1}{\rho}\right) = \delta$,

the inequalities (3.36) and (3.37) can be written as

(3.40)   $B^h \eta \leq E^* \leq B^h$   ($h > 0$)

and

(3.41)   $A^h \leq E^{**} \leq A^h \delta$   ($h > 0$).

Since $B < 1$ and $A > 1$, we see that $E^* < 1$ and $E^{**} > 1$ if $h > 0$. From this and the relations (3.34), (3.40) and (3.41) it follows easily that

(3.42)   $\dfrac{1 - B^h}{\delta A^h - B^h} \leq \gamma \leq \dfrac{1 - \eta B^h}{A^h - \eta B^h}$   ($h > 0$).

11 See relations (23) and (26) in [4]. The notation used here is somewhat different from that in [4].


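If $h < 0$, limits for $\gamma$ can be obtained as follows: Let $z' = -z$, $A' = \dfrac{1}{B}$ and $B' = \dfrac{1}{A}$. Then $h' = -h > 0$ and $\gamma' = 1 - \gamma$. Thus, according to (3.42) we have

(3.43)   $\dfrac{1 - (B')^{h'}}{\delta' (A')^{h'} - (B')^{h'}} \leq 1 - \gamma \leq \dfrac{1 - \eta' (B')^{h'}}{(A')^{h'} - \eta' (B')^{h'}}$

where $\delta'$ and $\eta'$ are equal to the expressions we obtain from (3.39) and (3.38), respectively, by substituting $h'$ for $h$ and $z'$ for $z$. Since $\eta$ and $\delta$ depend only on the product $hz = h'z'$, we see that $\delta' = \delta$ and $\eta' = \eta$. Hence, we obtain from (3.43)

(3.44)   $\dfrac{1 - A^h}{\delta B^h - A^h} \leq 1 - \gamma \leq \dfrac{1 - \eta A^h}{B^h - \eta A^h}$   ($h < 0$)

where $\delta$ and $\eta$ are given by (3.39) and (3.38), respectively.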

In Section 3.5 we shall calculate the value of $\eta$ and $\delta$ for binomial and normal distributions. If the limits of $\gamma$, as given in (3.42) and (3.44), are too far apart, it may be desirable to determine the exact value of $\gamma$, or at least to find a closer approximation to $\gamma$ than that given in (3.35). A solution of this problem is given in [4] (see section 7 of that paper). There the exact value of $\gamma$ is derived when $z$ can take only a finite number of integral multiples of a constant $d$. If $z$ does not have this property, arbitrarily fine approximation to the value of $\gamma$ can be obtained, since the distribution of $z$ can be approximated to any desired degree by a discrete distribution of the type mentioned before if the constant $d$ is chosen sufficiently small. The results obtained in [4] can be stated as follows: There is no loss of generality in assuming that $d = 1$, since the quantity $d$ can be chosen as the unit of measurement. Thus, we shall assume that $z$ takes only a finite number of integral values. Let $g_1$ and $g_2$ be two positive integers such that $P(z = -g_1)$ and $P(z = g_2)$ are positive and $z$ can take only integral values $\geq -g_1$ and $\leq g_2$. Denote $P(z = i)$ by $h_i$. Then the moment generating function of $z$ is given by

$\varphi(t) = \sum_{i=-g_1}^{g_2} h_i e^{it}$.

Put $u = e^t$ and let $u_1, \ldots, u_g$ be the $g = g_1 + g_2$ roots of the equation of $g$-th degree

(3.45)   $\sum_{i=-g_1}^{g_2} h_i u^i = 1$.

Denote by $[a]$ the smallest integer $\geq \log A$, and by $[b]$ the largest integer $\leq \log B$. Then $Z_n$ can take only the values

(3.46)   $[b] - g_1 + 1,\ [b] - g_1 + 2,\ \ldots,\ [b],\ [a],\ [a] + 1,\ \ldots,\ [a] + g_2 - 1$.

Denote the $g$ different integers in (3.46) by $c_1, \ldots, c_g$, respectively. Let $\Delta$ be the determinant value of the matrix $\| u_i^{c_j} \|$ ($i, j = 1, \ldots, g$) and let $\Delta_j$ be the determinant we obtain from $\Delta$ by substituting 1 for the elements in the $j$-th column. Then, if $\Delta \neq 0$, the probability that $Z_n = c_j$ is given by

(3.47)   $P(Z_n = c_j) = \dfrac{\Delta_j}{\Delta}$.

Hence

(3.48)   $\gamma = P(Z_n \geq [a]) = \sum_j \dfrac{\Delta_j}{\Delta}$

where the summation is to be taken over all values of $j$ for which $c_j \geq [a]$.
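The determinant formula (3.47) can be illustrated on the simplest case, in which $z$ takes only the values $-1$ and $+1$ (so $g_1 = g_2 = 1$, $g = 2$). The Python sketch below makes several assumptions for illustration: the roots of (3.45) are supplied by the caller, determinants are computed by cofactor expansion (adequate only for small $g$), and the numerical values are invented. For steps of $\pm 1$ with $P(z = 1) = p$, equation (3.45) reads $q u^{-1} + p u = 1$ and has the roots $u = 1$ and $u = q/p$.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, entry in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def exit_distribution(roots, values):
    """P(Z_n = c_j) = Delta_j / Delta as in (3.47).

    roots  : the g roots u_1, ..., u_g of (3.45)
    values : the g attainable terminal values c_1, ..., c_g of (3.46)
    """
    base = [[float(u) ** c for c in values] for u in roots]
    delta = det(base)
    probs = []
    for j in range(len(values)):
        # Delta_j: replace the j-th column of the matrix by ones.
        replaced = [row[:j] + [1.0] + row[j + 1:] for row in base]
        probs.append(det(replaced) / delta)
    return probs


# Illustrative case (assumed): p = 0.4, q = 0.6, [b] = -3, [a] = 3.
p, q = 0.4, 0.6
probs = exit_distribution([1.0, q / p], [-3, 3])
gamma = probs[1]            # (3.48): sum over the values c_j >= [a]
print(probs, gamma)
```

In this case $Z_n$ cannot overshoot a boundary, and the result agrees with the classical ruin formula $(1 - r^{3})/(1 - r^{6})$ with $r = q/p$.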

3.5. Calculation of $\delta$ and $\eta$ for binomial and normal distributions. Let $X$ be a random variable which can take only the values 0 and 1. Let the probability that $X = 1$ be $p_i$ if $H_i$ is true ($i = 0, 1$), and $p$ if $H$ is true. Denote $1 - p$ by $q$ and $1 - p_i$ by $q_i$ ($i = 0, 1$). Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$ and $f(0) = q$. It can be assumed without loss of generality that $p_1 > p_0$. The moment generating function of $z = \log \dfrac{f_1(x)}{f_0(x)}$ is given by

$\varphi(t) = E\left(\dfrac{f_1(x)}{f_0(x)}\right)^t = p \left(\dfrac{p_1}{p_0}\right)^t + q \left(\dfrac{q_1}{q_0}\right)^t$.

Let $h \neq 0$ be the value of $t$ for which $\varphi(h) = 1$, i.e.,

$p \left(\dfrac{p_1}{p_0}\right)^h + q \left(\dfrac{q_1}{q_0}\right)^h = 1$.

First we consider the case when $h > 0$. It is clear that $e^{hz} = \left(\dfrac{f_1(x)}{f_0(x)}\right)^h > 1$ implies that $x = 1$. Hence $e^{hz} > 1$ implies that $e^{hz} = \left(\dfrac{f_1(1)}{f_0(1)}\right)^h = \left(\dfrac{p_1}{p_0}\right)^h$. From this and the definition of $\delta$ given in (3.39) it follows that

(3.49)   $\delta = \left(\dfrac{p_1}{p_0}\right)^h$   ($h > 0$).

Similarly, the inequality $e^{hz} < 1$ implies that $e^{hz} = \left(\dfrac{q_1}{q_0}\right)^h$. From this and the definition of $\eta$ given in (3.38) it follows that

(3.50)   $\eta = \left(\dfrac{q_1}{q_0}\right)^h$   ($h > 0$).

If $h < 0$, it can be shown in a similar way that

(3.51)   $\delta = \left(\dfrac{q_1}{q_0}\right)^h$   ($h < 0$)

and

(3.52)   $\eta = \left(\dfrac{p_1}{p_0}\right)^h$   ($h < 0$).
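For the binomial case the quantities above can be computed directly. The following Python sketch solves $\varphi(h) = 1$ for the nonzero root by bisection, evaluates $\delta$ and $\eta$ from (3.49)-(3.52), and converts (3.42) and (3.44) into limits for $\gamma$. The function names, the bisection scheme and the numerical values are assumptions made for illustration; the sketch relies on the fact that $\varphi$ is convex with $\varphi(0) = 1$, so the nonzero root has the sign opposite to that of $Ez$.

```python
import math


def solve_h(p, p0, p1, iterations=200):
    """Nonzero root h of p*(p1/p0)**h + q*(q1/q0)**h = 1, found by bisection."""
    q, q0, q1 = 1.0 - p, 1.0 - p0, 1.0 - p1

    def phi(t):
        return p * (p1 / p0) ** t + q * (q1 / q0) ** t

    mean_z = p * math.log(p1 / p0) + q * math.log(q1 / q0)
    if mean_z == 0.0:
        raise ValueError("E z = 0: no nonzero root exists")
    near, far = 0.0, (-1.0 if mean_z > 0 else 1.0)
    while phi(far) < 1.0:            # move outward until past the root
        near, far = far, 2.0 * far
    for _ in range(iterations):      # phi < 1 strictly between 0 and h
        mid = 0.5 * (near + far)
        if phi(mid) < 1.0:
            near = mid
        else:
            far = mid
    return 0.5 * (near + far)


def delta_eta(h, p0, p1):
    """(delta, eta) from (3.49)-(3.52), assuming p1 > p0."""
    q0, q1 = 1.0 - p0, 1.0 - p1
    if h > 0:
        return (p1 / p0) ** h, (q1 / q0) ** h
    return (q1 / q0) ** h, (p1 / p0) ** h


def gamma_limits(h, A, B, delta, eta):
    """Limits for gamma: (3.42) if h > 0, (3.44) rewritten for gamma if h < 0."""
    if h > 0:
        lower = (1.0 - B ** h) / (delta * A ** h - B ** h)
        upper = (1.0 - eta * B ** h) / (A ** h - eta * B ** h)
    else:
        lower = 1.0 - (1.0 - eta * A ** h) / (B ** h - eta * A ** h)
        upper = 1.0 - (1.0 - A ** h) / (delta * B ** h - A ** h)
    return lower, upper


# Illustrative values (assumed): p0 = 0.5, p1 = 0.8, A = 19, B = 1/19, H = H0.
h = solve_h(0.5, 0.5, 0.8)
d, e = delta_eta(h, 0.5, 0.8)
print(h, d, e, gamma_limits(h, 19.0, 1.0 / 19.0, d, e))
```

When $H = H_0$ the root is $h = 1$, and the limits bracket the value $\alpha = 0.05$ given by the approximation (3.35).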


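Now we shall calculate the values of $\delta$ and $\eta$ if $X$ is normally distributed. Let

(3.53)   $f_i(x) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \theta_i)^2}$   ($i = 0, 1$)

and

(3.54)   $f(x) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x - \theta)^2}$.

We can assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where $\Delta > 0$, since this can always be achieved by a translation. Then

(3.55)   $z = \log \dfrac{f_1(x)}{f_0(x)} = 2\Delta x$.

The moment generating function of $z$ is given by

(3.56)   $\varphi(t) = E e^{2\Delta x t} = e^{2\Delta\theta t + 2\Delta^2 t^2}$.

Hence

(3.57)   $h = -\dfrac{\theta}{\Delta}$.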


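Substituting this value of $h$ in (3.38) and (3.39) we obtain

(3.58)   $\delta = \underset{\rho}{\text{l.u.b.}}\; \rho E\left(e^{-2\theta x} \,\middle|\, e^{-2\theta x} \geq \dfrac{1}{\rho}\right)$

and

(3.59)   $\eta = \underset{\zeta}{\text{g.l.b.}}\; \zeta E\left(e^{-2\theta x} \,\middle|\, e^{-2\theta x} \leq \dfrac{1}{\zeta}\right)$.

For any relation $R$ let $P^*(R)$ denote the probability that the relation $R$ holds calculated under the assumption that the distribution of $x$ is normal with mean $\theta$ and variance unity. Furthermore, let $P^{**}(R)$ denote the probability that $R$ holds if the distribution of $x$ is normal with mean $-\theta$ and variance unity. Since $e^{-2\theta x}$ is equal to the ratio of the normal probability density function with mean $-\theta$ and variance unity to the normal probability density function with mean $\theta$ and variance unity, we see that

(3.60)   $E\left(e^{-2\theta x} \,\middle|\, e^{-2\theta x} \geq \dfrac{1}{\rho}\right) = \dfrac{P^{**}\left(e^{-2\theta x} \geq \frac{1}{\rho}\right)}{P^{*}\left(e^{-2\theta x} \geq \frac{1}{\rho}\right)}$

and

(3.61)   $E\left(e^{-2\theta x} \,\middle|\, e^{-2\theta x} \leq \dfrac{1}{\zeta}\right) = \dfrac{P^{**}\left(e^{-2\theta x} \leq \frac{1}{\zeta}\right)}{P^{*}\left(e^{-2\theta x} \leq \frac{1}{\zeta}\right)}$.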


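It can easily be verified that the right hand side expressions in (3.60) and (3.61) have the same values for $\theta = \lambda$ as for $\theta = -\lambda$. Thus, also $\delta$ and $\eta$ have the same values for $\theta = \lambda$ as for $\theta = -\lambda$. It will be, therefore, sufficient to compute $\delta$ and $\eta$ for negative values of $\theta$. Let $\theta = -\lambda$ where $\lambda > 0$. First we show that $\eta = \dfrac{1}{\delta}$. Clearly

(3.62)   $\zeta E\left(e^{2\lambda x} \,\middle|\, e^{2\lambda x} \leq \dfrac{1}{\zeta}\right) = \zeta\, \dfrac{P^{**}\left(e^{2\lambda x} \leq \frac{1}{\zeta}\right)}{P^{*}\left(e^{2\lambda x} \leq \frac{1}{\zeta}\right)}$   ($1 < \zeta < \infty$).

Putting $\zeta = \dfrac{1}{\rho}$ ($0 < \rho < 1$) in (3.62) gives

(3.63)   $\zeta E\left(e^{2\lambda x} \,\middle|\, e^{2\lambda x} \leq \dfrac{1}{\zeta}\right) = \dfrac{1}{\rho\, \dfrac{P^{*}(e^{2\lambda x} \leq \rho)}{P^{**}(e^{2\lambda x} \leq \rho)}}$   ($0 < \rho < 1$).

Hence

(3.64)   $\eta = \dfrac{1}{\underset{\rho}{\text{l.u.b.}}\; \rho\, \dfrac{P^{*}(e^{2\lambda x} \leq \rho)}{P^{**}(e^{2\lambda x} \leq \rho)}}$.

Because of the symmetry of the normal distribution, it is easily seen that

$\underset{\rho}{\text{l.u.b.}}\; \rho\, \dfrac{P^{*}(e^{2\lambda x} \leq \rho)}{P^{**}(e^{2\lambda x} \leq \rho)} = \underset{\rho}{\text{l.u.b.}}\; \rho\, \dfrac{P^{**}\left(e^{2\lambda x} \geq \frac{1}{\rho}\right)}{P^{*}\left(e^{2\lambda x} \geq \frac{1}{\rho}\right)} = \delta$.

Hence

(3.65)   $\eta = \dfrac{1}{\delta}$.

Now we shall calculate the value of $\delta$. Denote $\dfrac{1}{\sqrt{2\pi}} \int_x^\infty e^{-\frac{1}{2}t^2}\, dt$ by $G(x)$. Then

$P^{**}\left(e^{2\lambda x} \geq \dfrac{1}{\rho}\right) = P^{**}\left(2\lambda x \geq \log \dfrac{1}{\rho}\right) = P^{**}\left(x \geq \dfrac{1}{2\lambda} \log \dfrac{1}{\rho}\right) = G\left(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} - \lambda\right)$.

Similarly

$P^{*}\left(e^{2\lambda x} \geq \dfrac{1}{\rho}\right) = P^{*}\left(x \geq \dfrac{1}{2\lambda} \log \dfrac{1}{\rho}\right) = G\left(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} + \lambda\right)$.

Denote $\dfrac{1}{2\lambda} \log \dfrac{1}{\rho}$ by $u$. Since $\rho$ can vary from 0 to 1, $u$ can take any value from 0 to $\infty$. Since $\rho = e^{-2\lambda u}$, we have

(3.66)   $\delta = \underset{\rho}{\text{l.u.b.}} \left[ \rho\, \dfrac{P^{**}\left(e^{2\lambda x} \geq \frac{1}{\rho}\right)}{P^{*}\left(e^{2\lambda x} \geq \frac{1}{\rho}\right)} \right] = \underset{u}{\text{l.u.b.}} \left[ e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)} \right]$   ($0 < u < \infty$).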


We shall prove that

(3.67)   $e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)} = \chi(u)$ (say)

is a monotonically decreasing function of $u$ and consequently the maximum is at $u = 0$. For this purpose it suffices to show that the derivative of $\log \chi(u)$ is never positive. Now

(3.68)   $\log \chi(u) = \log G(u - \lambda) - \log G(u + \lambda) - 2\lambda u$.

Denote $\dfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$ by $\Phi(x)$. Since $\dfrac{d}{du} G(u) = -\Phi(u)$ it follows from (3.68) that

(3.69)   $\dfrac{d}{du} \log \chi(u) = -\dfrac{\Phi(u - \lambda)}{G(u - \lambda)} + \dfrac{\Phi(u + \lambda)}{G(u + \lambda)} - 2\lambda$.
It follows from the mean value theorem that the right hand side of (3.69) is never positive if $\dfrac{d}{du}\left(\dfrac{\Phi(u)}{G(u)}\right)$ is equal to or less than 1 for all values of $u$. Thus, we need merely to show that

(3.70)   $\dfrac{d}{du}\left(\dfrac{\Phi(u)}{G(u)}\right) = \dfrac{\Phi'(u) G(u) - G'(u) \Phi(u)}{G^2(u)} = \dfrac{-u\,\Phi(u) G(u) + \Phi^2(u)}{G^2(u)} = \dfrac{\Phi^2(u)}{G^2(u)} - u\, \dfrac{\Phi(u)}{G(u)} \leq 1$.

Denote $\dfrac{\Phi(u)}{G(u)}$ by $y$. The roots of the equation $y^2 - uy - 1 = 0$ are

$y = \dfrac{u \pm \sqrt{u^2 + 4}}{2}$.

Hence the inequality $y^2 - uy - 1 \leq 0$ holds if and only if

$\dfrac{u - \sqrt{u^2 + 4}}{2} \leq y \leq \dfrac{u + \sqrt{u^2 + 4}}{2}$.

Since $y$ cannot be negative, this inequality is equivalent to

(3.71)   $\dfrac{\Phi(u)}{G(u)} \leq \dfrac{u + \sqrt{u^2 + 4}}{2}$.
Thus we have merely to prove (3.71). We shall show that (3.71) holds for all real values of $u$. Birnbaum has shown [5] that for $u > 0$

(3.72)   $\dfrac{\sqrt{u^2 + 4} - u}{2}\, \Phi(u) \leq G(u)$.

Hence

(3.73)   $\dfrac{\Phi(u)}{G(u)} \leq \dfrac{2}{\sqrt{u^2 + 4} - u} = \dfrac{\sqrt{u^2 + 4} + u}{2}$

which proves (3.71) for $u > 0$. Now we prove (3.71) for $u < 0$. Let $u = -v$ where $v > 0$. Then it follows from (3.73) that

(3.74)   $\dfrac{\Phi(v)}{G(v)} \leq \dfrac{2}{\sqrt{4 + v^2} - v}$   ($v > 0$).

Taking reciprocals, we obtain from (3.74)

(3.75)   $\dfrac{G(v)}{\Phi(v)} \geq \dfrac{\sqrt{4 + v^2} - v}{2}$.

Since

$\dfrac{G(u)}{\Phi(u)} \geq \dfrac{G(v) + 2v\,\Phi(v)}{\Phi(v)} = \dfrac{G(v)}{\Phi(v)} + 2v$

we obtain from (3.75)

(3.76)   $\dfrac{G(u)}{\Phi(u)} \geq \dfrac{\sqrt{v^2 + 4} + 3v}{2} \geq \dfrac{\sqrt{v^2 + 4} + v}{2}$.

Taking reciprocals, we obtain

$\dfrac{\Phi(u)}{G(u)} \leq \dfrac{2}{\sqrt{v^2 + 4} + v} = \dfrac{\sqrt{v^2 + 4} - v}{2} = \dfrac{\sqrt{u^2 + 4} + u}{2}$.

Hence (3.71) is proved for all values of $u$ and consequently $\delta$ is equal to the value of the expression (3.67) if we substitute 0 for $u$. Thus,

(3.77)   $\delta = \dfrac{G(-\lambda)}{G(\lambda)}$.
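The result (3.77), together with $\eta = 1/\delta$ from (3.65), is straightforward to evaluate with the complementary error function. The Python sketch below uses function names chosen here for illustration, takes $\lambda = |\theta|$, and also spot-checks on a grid that $\chi(u)$ of (3.67) is decreasing; the grid check is a numerical illustration only, not a substitute for the proof above.

```python
import math


def upper_tail(x):
    """G(x): probability that a standard normal variate exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def chi(u, lam):
    """chi(u) = exp(-2*lam*u) * G(u - lam) / G(u + lam), as in (3.67)."""
    return math.exp(-2.0 * lam * u) * upper_tail(u - lam) / upper_tail(u + lam)


def delta_eta_normal(theta):
    """(delta, eta) from (3.77) and (3.65), with lam = |theta|."""
    lam = abs(theta)
    delta = upper_tail(-lam) / upper_tail(lam)
    return delta, 1.0 / delta


# Illustrative value (assumed): theta = -0.5, i.e. lam = 0.5.
delta, eta = delta_eta_normal(-0.5)
print(delta, eta)

# Grid check that chi decreases from its value delta at u = 0.
grid = [0.1 * k for k in range(51)]
values = [chi(u, 0.5) for u in grid]
print(all(a > b for a, b in zip(values, values[1:])))
```

Since $\delta$ and $\eta$ depend on $\theta$ only through $|\theta|$, the same pair is obtained for $\theta = 0.5$ and $\theta = -0.5$.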


4. The Number of Observations Required by the Sequential Probability Ratio Test

4.1. Expected number of observations necessary for reaching a decision. As before, let

$z = \log \dfrac{f_1(x)}{f_0(x)}$,   $z_i = \log \dfrac{f_1(x_i)}{f_0(x_i)}$   ($i = 1, 2, \ldots$, ad inf.)

and let $n$ be the number of observations required by the sequential test, i.e., $n$ is the smallest integer for which $Z_n = z_1 + \cdots + z_n$ is either $\geq \log A$ or $\leq \log B$. To determine the expected value $E(n)$ of $n$ under any hypothesis $H$ we shall consider a fixed positive integer $N$. The sum $Z_N = z_1 + \cdots + z_N$ can be split in two parts as follows

(4.1)   $Z_N = Z_n + Z'_n$

where $Z'_n = z_{n+1} + \cdots + z_N$ if $n \leq N$ and $Z'_n = Z_N - Z_n$ if $n > N$. Taking expected values on both sides of (4.1) we obtain

(4.2)   $N\, Ez = EZ_n + EZ'_n$.

Since the probability that $n > N$ converges to zero as $N \to \infty$, and since $|Z'_n| < 2(\log A + |\log B|)$ if $n > N$, it can be seen that

(4.3)   $\lim_{N \to \infty} \left[ EZ'_n - E(N - n)\, Ez \right] = 0$.

From (4.2) and (4.3) it follows that

(4.4)   $EZ_n = En\, Ez$.

Hence

(4.5)   $En = \dfrac{EZ_n}{Ez}$.


Let $E^* Z_n$ be the conditional expected value of $Z_n$ under the restriction that the sequential analysis leads to the acceptance of $H_0$, i.e. that $Z_n \leq \log B$. Similarly, let $E^{**} Z_n$ be the conditional expected value of $Z_n$ under the restriction that $H_1$ is accepted, i.e., that $Z_n \geq \log A$. Since $\gamma$ is the probability that $Z_n \geq \log A$, we have

(4.6)   $EZ_n = (1 - \gamma) E^* Z_n + \gamma E^{**} Z_n$.

From (4.5) and (4.6) we obtain

(4.7)   $En = \dfrac{(1 - \gamma) E^* Z_n + \gamma E^{**} Z_n}{Ez}$.

The exact value of $EZ_n$, and therefore also the exact value of $En$, can be computed if $z$ can take only integral multiples of a constant $d$, since in this case the exact probability distribution of $Z_n$ was obtained (see equation (3.47)). If $z$ does not satisfy the above restriction, it is still possible to obtain arbitrarily fine approximations to the value of $EZ_n$, since the distribution of $z$ can be approximated to any desired degree by a discrete distribution of the type mentioned above if the constant $d$ is chosen sufficiently small.

If both $|Ez|$ and the standard deviation of $z$ are small, $E^*Z_n$ is very nearly
equal to $\log B$ and $E^{**}Z_n$ is very nearly equal to $\log A$. Hence in this case we
can write

(4.8)    $En \sim \frac{(1 - \gamma)\log B + \gamma \log A}{Ez}$.


To judge the goodness of the approximation given in (4.8) we shall derive lower
and upper limits for $En$ by deriving lower and upper limits for $E^*Z_n$ and $E^{**}Z_n$.
Let $r$ be a non-negative variable and let

(4.9)    $\xi = \max_r E(z - r \mid z \geq r)$

and

(4.10)    $\xi' = \min_r E(z + r \mid z + r \leq 0)$.

It is easy to see that

(4.11)    $\log A \leq E^{**}Z_n \leq \log A + \xi$

and

(4.12)    $\log B + \xi' \leq E^*Z_n \leq \log B$.

We obtain from (4.7), (4.11) and (4.12)

(4.13)    $\frac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez} \leq En \leq \frac{(1 - \gamma)\log B + \gamma(\log A + \xi)}{Ez}$

if $Ez > 0$, and

(4.14)    $\frac{(1 - \gamma)\log B + \gamma(\log A + \xi)}{Ez} \leq En \leq \frac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez}$

if $Ez < 0$.
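
The approximation (4.8) and the limits (4.13) and (4.14) are easy to evaluate numerically. The following Python sketch is an illustration only; the function name `expected_n_bounds` and the numbers in the example are assumptions and do not come from the text.

```python
import math

def expected_n_bounds(gamma, A, B, ez, xi=0.0, xi_prime=0.0):
    """Approximation (4.8) and the limits (4.13)/(4.14) for E n.

    gamma    : probability that Z_n >= log A under the hypothesis considered
    ez       : E z under that hypothesis (must be non-zero)
    xi       : the quantity of (4.9), >= 0
    xi_prime : the quantity of (4.10), <= 0
    Returns (lower, approx, upper).
    """
    log_a, log_b = math.log(A), math.log(B)
    approx = ((1 - gamma) * log_b + gamma * log_a) / ez
    with_xi = ((1 - gamma) * log_b + gamma * (log_a + xi)) / ez
    with_xi_prime = ((1 - gamma) * (log_b + xi_prime) + gamma * log_a) / ez
    if ez > 0:
        return with_xi_prime, approx, with_xi      # ordering of (4.13)
    return with_xi, approx, with_xi_prime          # ordering of (4.14)

# Hypothetical values: A = 19, B = 1/19, gamma = .95, E z = .5.
lower, approx, upper = expected_n_bounds(0.95, 19.0, 1 / 19.0, 0.5, xi=0.2, xi_prime=-0.2)
```

With $\xi = \xi' = 0$ the three returned values coincide, which corresponds to the case in which the excess over the boundaries is neglected.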


4.2. Calculation of the quantities $\xi$ and $\xi'$ for binomial and normal distributions.
Let $X$ be a random variable which can take only the values 0 and 1. Let the
probability that $X = 1$ be $p_i$ if $H_i$ is true $(i = 0, 1)$, and $p$ if $H$ is true. Denote
$1 - p$ by $q$ and $1 - p_i$ by $q_i$ $(i = 0, 1)$. Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$
and $f(0) = q$. It can be assumed without loss of generality that $p_1 > p_0$. It
is clear that $\log \frac{f_1(x)}{f_0(x)} \geq 0$ implies that $x = 1$ and consequently
$\log \frac{f_1(x)}{f_0(x)} = \log \frac{f_1(1)}{f_0(1)} = \log \frac{p_1}{p_0}$. Hence

(4.15)    $\xi = \max_r E(z - r \mid z \geq r) = \log \frac{p_1}{p_0}$.

Since $\log \frac{f_1(x)}{f_0(x)} \leq 0$ implies that $x = 0$, we have

(4.16)    $\xi' = \min_r E(z + r \mid z + r \leq 0) = \log \frac{q_1}{q_0}$.
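
For the binomial case the two quantities reduce to the two possible values of $z$. A minimal Python sketch, assuming a hypothetical helper name `binomial_xi` and hypothetical values $p_0 = .1$, $p_1 = .2$:

```python
import math

def binomial_xi(p0, p1):
    """Return (xi, xi_prime) of (4.15) and (4.16) for 0 < p0 < p1 < 1."""
    if not 0 < p0 < p1 < 1:
        raise ValueError("need 0 < p0 < p1 < 1")
    xi = math.log(p1 / p0)                     # value of z when x = 1
    xi_prime = math.log((1 - p1) / (1 - p0))   # value of z when x = 0
    return xi, xi_prime

xi, xi_prime = binomial_xi(0.1, 0.2)
```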
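Now we shall calculate the values $\xi$ and $\xi'$ if $X$ is normally distributed. Let

    $f_i(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_i)^2} \qquad (i = 0, 1) \quad (\theta_1 > \theta_0)$

and

    $f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$.

We may assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where
$\Delta > 0$, since this can always be achieved by a translation. Then

(4.17)    $z = \log \frac{f_1(x)}{f_0(x)} = 2\Delta x$.

Denote $\frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ by $\Phi(x)$ and $\frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ by $G(x)$. Let $t = x - \theta$.
Then $z = 2\Delta(t + \theta)$ and

(4.18)    $E(z - r \mid z - r \geq 0) = 2\Delta\, E\left(t + \theta - \frac{r}{2\Delta} \,\Big|\, t + \theta - \frac{r}{2\Delta} \geq 0\right)$
          $= \frac{2\Delta}{G(t_0)} \int_{t_0}^{\infty} (t - t_0)\Phi(t)\, dt = \frac{2\Delta}{G(t_0)} \left[-t_0 G(t_0) + \Phi(t_0)\right]$

where

(4.19)    $t_0 = \frac{r}{2\Delta} - \theta$.

In section 3.5 (see equation (3.70)) it was proved that $\frac{\Phi(t_0)}{G(t_0)} - t_0$ is a monotonically
decreasing function of $t_0$. Hence the maximum of $E(z - r \mid z - r \geq 0)$
is reached for $r = 0$ and consequently

(4.20)    $\xi = \frac{2\Delta}{G(-\theta)} \left[\theta G(-\theta) + \Phi(-\theta)\right] = 2\Delta \left[\theta + \frac{\Phi(-\theta)}{G(-\theta)}\right]$.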
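Now we shall calculate $\xi'$. We have

(4.21)    $\xi' = \min_r E(z + r \mid z + r \leq 0) = -\max_r E(-z - r \mid -z - r \geq 0)$
          $= -2\Delta \max_r E\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \geq 0\right)$.

Let $t = -x + \theta$ and $t_0 = \frac{r}{2\Delta} + \theta$. Then

(4.22)    $E\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \geq 0\right) = E(t - t_0 \mid t - t_0 \geq 0)$
          $= \frac{1}{G(t_0)} \int_{t_0}^{\infty} (t - t_0)\Phi(t)\, dt = \frac{\Phi(t_0)}{G(t_0)} - t_0$.

Since this is a monotonically decreasing function of $t_0$, we have

(4.23)    $\max_r E\left(-x - \frac{r}{2\Delta} \,\Big|\, -x - \frac{r}{2\Delta} \geq 0\right) = \frac{\Phi(\theta)}{G(\theta)} - \theta$.

From (4.21) and (4.23) we obtain

(4.24)    $\xi' = -2\Delta \left[\frac{\Phi(\theta)}{G(\theta)} - \theta\right]$.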
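
Formulas (4.20) and (4.24) can be evaluated with the complementary error function. The sketch below is only an illustration; the names `phi`, `G` and `normal_xi` are chosen here for convenience, and unit variance with $\theta_0 = -\Delta$, $\theta_1 = \Delta$ is assumed as in the text.

```python
import math

def phi(x):
    """Standard normal density, the function denoted Phi(x) in the text."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def G(x):
    """Upper tail of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def normal_xi(delta, theta):
    """(xi, xi_prime) from (4.20) and (4.24); delta > 0, unit variance."""
    xi = 2 * delta * (theta + phi(-theta) / G(-theta))
    xi_prime = -2 * delta * (phi(theta) / G(theta) - theta)
    return xi, xi_prime

# Hypothetical example: Delta = .5 and true mean theta = 0.
xi, xi_prime = normal_xi(0.5, 0.0)
```

For $\theta = 0$ the two values are equal in magnitude, since $\Phi(0)/G(0) = \sqrt{2/\pi}$.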
4.3. Saving in the number of observations as compared with the current test
procedure. We consider the case of a normally distributed variate, such that

    $f_0(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_0)^2}$,

    $f_1(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_1)^2} \qquad (\theta_1 \neq \theta_0)$.

Denote by $n(\alpha, \beta)$ the minimum number of observations necessary in the current
most powerful test for the probabilities of errors of the first and second kinds
to be $\alpha$ and $\beta$, respectively, or less.

We shall calculate the number of observations required by the most powerful
test. It can be assumed without loss of generality that $\theta_0 < \theta_1$. According
to the current most powerful test procedure the hypothesis $H_0$ is accepted if
$\bar{x} \leq d$ and the hypothesis $H_1$ is accepted if $\bar{x} > d$, where $\bar{x}$ is the arithmetic
mean of the observations and $d$ is a properly chosen constant. The probability
of an error of the first kind is given by $G[\sqrt{n}(d - \theta_0)]$ and the probability of an
error of the second kind is given by $1 - G[\sqrt{n}(d - \theta_1)]$ where
$G(t) = \frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-x^2/2}\, dx$. To equate these probabilities to $\alpha$ and $\beta$, respectively, the
quantities $d$ and $n$ must satisfy

(4.25)    $G[\sqrt{n}(d - \theta_0)] = \alpha$

and

(4.26)    $1 - G[\sqrt{n}(d - \theta_1)] = \beta$.


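Denote by $\lambda_0$ and $\lambda_1$ the values for which $G(\lambda_0) = \alpha$ and $G(\lambda_1) = 1 - \beta$. Then
we have

(4.27)    $\sqrt{n}(d - \theta_0) = \lambda_0$

and

(4.28)    $\sqrt{n}(d - \theta_1) = \lambda_1$.

Subtracting (4.27) from (4.28) we obtain

(4.29)    $\sqrt{n}(\theta_0 - \theta_1) = \lambda_1 - \lambda_0$.

From (4.29)

(4.30)    $n = n(\alpha, \beta) = \frac{(\lambda_1 - \lambda_0)^2}{(\theta_0 - \theta_1)^2}$.

If the expression on the right hand side of (4.30) is not an integer, $n(\alpha, \beta)$ is the
smallest integer in excess.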
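
As a numerical illustration of (4.30), the following Python sketch computes $n(\alpha, \beta)$ for unit-variance normal observations. The function name `fixed_sample_size` and the parameter values in the example are assumptions; `statistics.NormalDist` from the standard library supplies the normal quantiles.

```python
import math
from statistics import NormalDist

def fixed_sample_size(alpha, beta, theta0, theta1):
    """n(alpha, beta) of (4.30) for unit-variance normal observations."""
    std = NormalDist()
    lam0 = std.inv_cdf(1 - alpha)   # G(lam0) = alpha, G being the upper tail
    lam1 = std.inv_cdf(beta)        # G(lam1) = 1 - beta
    exact = (lam1 - lam0) ** 2 / (theta0 - theta1) ** 2
    return math.ceil(exact)         # smallest integer not below the exact value

# Hypothetical example: alpha = beta = .05 and theta1 - theta0 = 1.
n_example = fixed_sample_size(0.05, 0.05, 0.0, 1.0)
```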
In the sequential probability ratio test we put $A = a(\alpha, \beta) = \frac{1 - \beta}{\alpha}$ and
$B = b(\alpha, \beta) = \frac{\beta}{1 - \alpha}$. Then the probability of an error of the first (second)
kind cannot exceed $\alpha$ ($\beta$) except by a negligible amount. Let $A(\alpha, \beta)$ and
$B(\alpha, \beta)$ be the values of $A$ and $B$ for which the probabilities of errors of the first
and second kinds become exactly equal to $\alpha$ and $\beta$, respectively. It has been
shown in Section 3.2 that $A(\alpha, \beta) \leq a(\alpha, \beta)$ and $B(\alpha, \beta) \geq b(\alpha, \beta)$. Thus, the
expected values $E_1(n)$ and $E_0(n)$ are only increased by putting $A = a(\alpha, \beta)$ and
$B = b(\alpha, \beta)$ instead of $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.

Consider the case where $|\theta_1 - \theta_0|$ is small so that the quantities $\xi$ and $\xi'$ can
be neglected. Thus, we shall use the approximation (4.8). Since $\gamma = \alpha$ if $H = H_0$,
and $\gamma = 1 - \beta$ if $H = H_1$, we obtain from (4.8)

(4.31)    $E_1(n) = \frac{a^*}{E_1(z)} - \beta\, \frac{a^* + |b^*|}{E_1(z)}$

and

(4.32)    $E_0(n) = \frac{|b^*|}{E_0(-z)} - \alpha\, \frac{|b^*| + a^*}{E_0(-z)}$

where $a^* = \log a(\alpha, \beta) = \log \frac{1 - \beta}{\alpha}$ and $b^* = \log b(\alpha, \beta) = \log \frac{\beta}{1 - \alpha}$. Since

(4.33)    $E_1(z) = \frac{1}{2}(\theta_0 - \theta_1)^2$

and

(4.34)    $E_0(-z) = \frac{1}{2}(\theta_0 - \theta_1)^2$,

it follows from (4.30), (4.31) and (4.32) that $\frac{E_1(n)}{n(\alpha, \beta)}$ and $\frac{E_0(n)}{n(\alpha, \beta)}$ are independent
of the parameters $\theta_0$ and $\theta_1$.
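TABLE 1

Average percentage saving of sequential analysis, as compared with current most
powerful test for testing mean of a normally distributed variate

A. When alternative hypothesis is true:

   β \ α |  .01  |  .02  |  .03  |  .04  |  .05
   ------+-------+-------+-------+-------+------
    .01  |  58   |  ..   |  61   |  ..   |  ..
    .02  |  ..   |  ..   |  ..   |  ..   |  ..
    .03  |  51   |  53   |  ..   |  ..   |  ..
    .04  |  49   |  50   |  51   |  ..   |  ..
    .05  |  47   |  49   |  50   |  ..   |  ..

B. When null hypothesis is true:

   β \ α |  .01  |  .02  |  .03  |  .04  |  .05
   ------+-------+-------+-------+-------+------
    .01  |  58   |  ..   |  ..   |  ..   |  47
    .02  |  ..   |  ..   |  ..   |  ..   |  49
    .03  |  ..   |  ..   |  ..   |  ..   |  50
    .04  |  62   |  ..   |  ..   |  ..   |  50
    .05  |  63   |  ..   |  ..   |  ..   |  51

(Entries shown as ".." are not legible in this copy.)
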

The average saving of the sequential analysis as compared with the current
method is $100\left(1 - \frac{E_1(n)}{n(\alpha, \beta)}\right)$ per cent if $H_1$ is true, and
$100\left(1 - \frac{E_0(n)}{n(\alpha, \beta)}\right)$ per cent if $H_0$ is true. In Table 1 the expression
$100\left(1 - \frac{E_1(n)}{n(\alpha, \beta)}\right)$ is shown in Panel A, and the expression
$100\left(1 - \frac{E_0(n)}{n(\alpha, \beta)}\right)$ in Panel B, for several values of $\alpha$ and $\beta$.
Because of the symmetry of the normal distribution, Panel B is obtained from
Panel A simply by interchanging $\alpha$ and $\beta$.
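
The percentage savings can be recomputed from (4.30) to (4.34). The Python sketch below is an illustration under the same approximation (the quantities $\xi$ and $\xi'$ neglected, and the right hand side of (4.30) not rounded up to an integer); the function name `percentage_saving` is an assumption. Because the factor $(\theta_1 - \theta_0)^2$ cancels, no parameter values are needed.

```python
import math
from statistics import NormalDist

def percentage_saving(alpha, beta):
    """Return (saving if H1 is true, saving if H0 is true) in per cent."""
    std = NormalDist()
    # (lambda_1 - lambda_0)^2 of (4.30), i.e. n(alpha, beta) * (theta1 - theta0)^2
    n_fixed = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) ** 2
    a_star = math.log((1 - beta) / alpha)
    b_star = math.log(beta / (1 - alpha))
    e1 = 2 * ((1 - beta) * a_star + beta * b_star)       # E1(n) * (theta1 - theta0)^2
    e0 = -2 * ((1 - alpha) * b_star + alpha * a_star)    # E0(n) * (theta1 - theta0)^2
    return 100 * (1 - e1 / n_fixed), 100 * (1 - e0 / n_fixed)

saving_h1, saving_h0 = percentage_saving(0.01, 0.05)
```

For $\alpha = .01$, $\beta = .05$ this gives roughly 47 per cent when $H_1$ is true and roughly 63 per cent when $H_0$ is true.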
As can be seen from the table, for the range of $\alpha$ and $\beta$ from .01 to .05 (the
range most frequently employed), the sequential process leads to an average
saving of at least 47 per cent in the necessary number of observations as
compared with the current procedure. The true saving is slightly greater than shown
in the table, since $E_i(n)$ calculated under the condition that $A = a(\alpha, \beta)$ and
$B = b(\alpha, \beta)$ is greater than $E_i(n)$ calculated under the condition that
$A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.

4.4. The characteristic function, the moments and the distribution of the number
of observations necessary for reaching a decision. It was shown in [4] (see
equation (15) in [4]) that the following fundamental identity holds

(4.35)    $E\{e^{Z_n t}[\varphi(t)]^{-n}\} = 1 \qquad (\varphi(t) = Ee^{zt})$

for all points $t$ of the complex plane for which $\varphi(t)$ exists and $|\varphi(t)| \geq 1$. The
symbol $n$ denotes the number of observations required by the sequential test,
i.e., $n$ is the smallest positive integer for which $Z_n$ is either $\geq \log A$ or $\leq \log B$,
and $\varphi(t)$ denotes the moment generating function of $z$.

On the basis of the identity (4.35) the exact characteristic function of $n$ is
derived in section 7 of [4] in the case when $z$ can take only integral multiples of
a constant. If the number of different values which $Z_n$ can take is large, the
calculation of the exact characteristic function is cumbersome, because a large
number of simultaneous linear equations have to be solved. However, if $|Ez|$
and $\sigma_z$ are small so that $|Z_n - \log A|$ (when $Z_n \geq \log A$) and $|Z_n - \log B|$
(when $Z_n \leq \log B$) can be neglected, the calculation of the characteristic
function is much simpler, as was shown in [4]. We shall briefly state the results
obtained in [4]. Let $h$ be the real value $\neq 0$ for which $\varphi(h) = 1$. Furthermore
let $t = t_1(\tau)$ and $t = t_2(\tau)$ be the roots of the equation in $t$

    $-\log \varphi(t) = \tau$

such that $\lim_{\tau = 0} t_1(\tau) = 0$ and $\lim_{\tau = 0} t_2(\tau) = h$. Finally, let $\psi_1(\tau)$ be the
characteristic function of the conditional distribution of $n$ under the restriction that
$Z_n \geq \log A$, and $\psi_2(\tau)$ the characteristic function of the conditional distribution of $n$
under the restriction that $Z_n \leq \log B$. Then, if $|Z_n - \log A|$ (when $Z_n \geq \log A$)
and $|Z_n - \log B|$ (when $Z_n \leq \log B$) can be neglected, $\psi_1(\tau)$ and $\psi_2(\tau)$ are
the solutions of the linear equations

(4.36)    $\gamma \psi_1(\tau) A^{t_1(\tau)} + (1 - \gamma)\psi_2(\tau) B^{t_1(\tau)} = 1$

and

(4.37)    $\gamma \psi_1(\tau) A^{t_2(\tau)} + (1 - \gamma)\psi_2(\tau) B^{t_2(\tau)} = 1$

where

    $\gamma = P(Z_n \geq \log A) = \frac{1 - B^h}{A^h - B^h}$.

The characteristic function of the unconditional distribution of $n$ is

(4.38)    $\psi(\tau) = \gamma \psi_1(\tau) + (1 - \gamma)\psi_2(\tau)$.
As an illustration we shall determine $\psi_1(\tau)$, $\psi_2(\tau)$ and $\psi(\tau)$ when $z$ has a normal
distribution. Then we have

    $-\log \varphi(t) = -(Ez)t - \frac{\sigma_z^2}{2} t^2 = \tau$,

(4.39)    $h = -\frac{2Ez}{\sigma_z^2}$,

(4.40)    $t_1(\tau) = \frac{1}{\sigma_z^2}\left(-Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\right)$,
          $t_2(\tau) = \frac{1}{\sigma_z^2}\left(-Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\right)$.


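From (4.36), (4.37) and (4.38) we obtain

(4.41)    $\gamma \psi_1(\tau) = \frac{B^{g_2} - B^{g_1}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$,

(4.42)    $(1 - \gamma)\psi_2(\tau) = \frac{A^{g_1} - A^{g_2}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$

and

(4.43)    $\psi(\tau) = \frac{A^{g_1} + B^{g_2} - A^{g_2} - B^{g_1}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$

where

(4.44)    $g_1 = \frac{1}{\sigma_z^2}\left(-Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\right)$

and

(4.45)    $g_2 = \frac{1}{\sigma_z^2}\left(-Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\right)$.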
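
As a check on (4.43), the first derivative of $\psi(\tau)$ at $\tau = 0$ should reproduce the approximation (4.8) for $En$. The Python sketch below evaluates (4.43) and approximates the derivative by a central difference; the function names, the step size and the numerical values are assumptions made for illustration, and only real values of $\tau$ near zero are used.

```python
import math

def psi(tau, A, B, ez, sigma2):
    """psi(tau) of (4.43) for normally distributed z (excess neglected).

    ez is E z, sigma2 the variance of z; tau must satisfy
    ez**2 - 2*sigma2*tau >= 0 so that g1 and g2 are real.
    """
    root = math.sqrt(ez * ez - 2 * sigma2 * tau)
    g1 = (-ez + root) / sigma2          # (4.44)
    g2 = (-ez - root) / sigma2          # (4.45)
    num = A**g1 + B**g2 - A**g2 - B**g1
    den = A**g1 * B**g2 - A**g2 * B**g1
    return num / den

def mean_n(A, B, ez, sigma2, eps=1e-6):
    """First moment of n as the derivative of psi at tau = 0 (central difference)."""
    return (psi(eps, A, B, ez, sigma2) - psi(-eps, A, B, ez, sigma2)) / (2 * eps)

# Hypothetical values: A = 19, B = 1/19, E z = .5, variance of z = 1.
# Then h = -1 and gamma = (1 - B^h)/(A^h - B^h) = .95.
m1 = mean_n(19.0, 1 / 19.0, 0.5, 1.0)
m1_from_4_8 = (0.05 * math.log(1 / 19.0) + 0.95 * math.log(19.0)) / 0.5
```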


For any positive integer $r$ the $r$-th moment of $n$, i.e., $E(n^r)$, is equal to the $r$-th
derivative of $\psi(\tau)$ taken at $\tau = 0$. Let $E^*(n^r)$ be the conditional expected value
of $n^r$ under the restriction that $Z_n \leq \log B$, and let $E^{**}(n^r)$ be the conditional
expected value of $n^r$ under the restriction that $Z_n \geq \log A$. Then

(4.46)    $E^*(n^r) = \frac{d^r \psi_2(\tau)}{d\tau^r}\Big|_{\tau = 0}$ and $E^{**}(n^r) = \frac{d^r \psi_1(\tau)}{d\tau^r}\Big|_{\tau = 0}$.

It may be of interest to note that $\frac{d^r \psi_k(\tau)}{d\tau^r}\Big|_{\tau = 0}$ $(k = 1, 2)$ and therefore also the
moments of $n$ can be obtained from the identity (4.35) directly by successive
differentiation. In fact, the identity (4.35) can be written as (neglecting the
excess of $Z_n$ over the boundaries $\log A$ and $\log B$)

(4.47)    $\gamma A^t \psi_1[-\log \varphi(t)] + (1 - \gamma) B^t \psi_2[-\log \varphi(t)] = 1$.
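Taking the first $r$ derivatives of (4.47) with respect to $t$ at $t = 0$ and $t = h$
we obtain a system of $2r$ linear equations in the $2r$ unknowns
$\frac{d^j \psi_k(\tau)}{d\tau^j}\Big|_{\tau = 0}$ $(k = 1, 2;\ j = 1, \cdots, r)$ from which these unknowns can be determined. For example,
$\frac{d\psi_k(\tau)}{d\tau}\Big|_{\tau = 0}$ $(k = 1, 2)$ can be determined as follows: Taking the first derivative
of (4.47) with respect to $t$ and denoting $\frac{d\psi_k(\tau)}{d\tau}$ by $\psi_k^{(1)}(\tau)$ we obtain

(4.48)    $\gamma(\log A) A^t \psi_1[-\log \varphi(t)] - \gamma A^t \frac{\varphi'(t)}{\varphi(t)} \psi_1^{(1)}[-\log \varphi(t)]$
          $+ (1 - \gamma)(\log B) B^t \psi_2[-\log \varphi(t)] - (1 - \gamma) B^t \frac{\varphi'(t)}{\varphi(t)} \psi_2^{(1)}[-\log \varphi(t)] = 0$.

Putting $t = 0$ and $t = h$ we obtain the equations

(4.49)    $\gamma \log A - \gamma \frac{\varphi'(0)}{\varphi(0)} \psi_1^{(1)}(0) + (1 - \gamma) \log B - (1 - \gamma) \frac{\varphi'(0)}{\varphi(0)} \psi_2^{(1)}(0) = 0$

and

(4.50)    $\gamma(\log A) A^h - \gamma A^h \frac{\varphi'(h)}{\varphi(h)} \psi_1^{(1)}(0) + (1 - \gamma)(\log B) B^h - (1 - \gamma) B^h \frac{\varphi'(h)}{\varphi(h)} \psi_2^{(1)}(0) = 0$

from which $\psi_1^{(1)}(0)$ and $\psi_2^{(1)}(0)$ can be determined.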

The distribution of $n$ can be obtained by inverting the characteristic function
$\psi(\tau)$. This was done in [4] (neglecting the excess of $Z_n$ over $\log A$ and $\log B$)
in the case when $z$ is normally distributed. The results obtained in [4] can be
briefly stated as follows: If $B = 0$, or if $B > 0$ and $A = \infty$, the distribution
of $n$ is a simple elementary function. If $B = 0$ and $Ez > 0$, the distribution of
$m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by

(4.51)    $F(m)\, dm = \frac{c}{2\Gamma(\frac{1}{2})\, m^{3/2}}\, e^{-\frac{c^2}{4m} - m + c}\, dm \qquad (0 \leq m < \infty)$

where

(4.52)    $c = \frac{1}{\sigma_z^2}(Ez) \log A$.

If $B > 0$, $A = \infty$ and $Ez < 0$ the distribution of $m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by the
expression we obtain from (4.51) if we substitute $\frac{1}{\sigma_z^2}(Ez) \log B$ for $c$.

If $B > 0$ and $A < \infty$, the distribution of $m$ is given by an infinite series where
each term is of the form (4.51) (see equation (76) in [4]).
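
A quick numerical check of (4.51) is to verify that the density integrates to one and has mean $c/2$, the latter being what (4.8) gives for $En$ when $B = 0$, after the change of scale from $n$ to $m$. The sketch below is only an illustration: it uses $\Gamma(\frac{1}{2}) = \sqrt{\pi}$, a plain trapezoidal rule, and a hypothetical value $c = 2$; the function names are not from the text.

```python
import math

def density_m(m, c):
    """Density (4.51) of m for c > 0, using -c^2/(4m) - m + c = -(c - 2m)^2/(4m)."""
    if m <= 0:
        return 0.0
    return c / (2 * math.sqrt(math.pi) * m ** 1.5) * math.exp(-(c - 2 * m) ** 2 / (4 * m))

def integrate(func, lo, hi, steps):
    """Plain trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h

c = 2.0  # hypothetical value of (Ez) log A / sigma_z^2
total_mass = integrate(lambda m: density_m(m, c), 0.0, 60.0, 60000)
mean_m = integrate(lambda m: m * density_m(m, c), 0.0, 60.0, 60000)
```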
Since $m$ is a discrete variable, it may seem paradoxical that we obtained a
probability density function for $m$. However, the explanation lies in the fact
that we neglected the excess of $Z_n$ over $\log A$ and $\log B$ which is zero only in the
limiting case when $Ez$ and $\sigma_z$ approach zero.

The distribution of $m$ given in (4.51) can be used as a good approximation
to the exact distribution of $m$ even if $B > 0$, provided that the probability that
$Z_n \geq \log A$ is nearly equal to 1.

It was pointed out in [4] that if $|Ez|$ and $\sigma_z$ are sufficiently small, the
distribution of $n$ determined under the assumption that $z$ is normally distributed will
be a good approximation to the exact distribution of $n$ even if $z$ is not normally
distributed.

4.5. Lower limit of the probability that the sequential process will terminate with
a number of trials less than or equal to a given number. Let $P_i(n_0)$ be the
probability that the sequential process will terminate at a value $n \leq n_0$, calculated
under $H_i$ $(i = 0, 1)$. Let

(4.53)    $\bar{P}_0(n_0) = P_0\left[\sum_{\alpha = 1}^{n_0} z_\alpha \leq \log B\right]$

and

(4.54)    $\bar{P}_1(n_0) = P_1\left[\sum_{\alpha = 1}^{n_0} z_\alpha \geq \log A\right]$.

It is clear that

(4.55)    $\bar{P}_i(n_0) \leq P_i(n_0) \qquad (i = 0, 1)$.

For calculating $\bar{P}_i(n_0)$ we shall assume that $n_0$ is sufficiently large so that
$\sum_{\alpha = 1}^{n_0} z_\alpha$ can be regarded as normally distributed. Let $G(\lambda)$ be defined by

(4.56)    $G(\lambda) = \frac{1}{\sqrt{2\pi}} \int_{\lambda}^{\infty} e^{-t^2/2}\, dt$.

Furthermore, let

(4.57)    $\lambda_1(n_0) = \frac{\log A - n_0 E_1(z)}{\sqrt{n_0}\, \sigma_1(z)}$

and

(4.58)    $\lambda_0(n_0) = \frac{\log B - n_0 E_0(z)}{\sqrt{n_0}\, \sigma_0(z)}$

where $\sigma_i(z)$ is the standard deviation of $z$ under $H_i$. Then

(4.59)    $\bar{P}_1(n_0) = G[\lambda_1(n_0)]$

and

(4.60)    $\bar{P}_0(n_0) = 1 - G[\lambda_0(n_0)]$.

Hence we have the inequalities

(4.61)    $P_1(n_0) \geq G[\lambda_1(n_0)]$

and

(4.62)    $P_0(n_0) \geq 1 - G[\lambda_0(n_0)]$.
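
The lower bounds (4.61) and (4.62) are straightforward to compute for a test of the mean of a unit-variance normal variate. The following Python sketch is an illustration only: the function name `termination_lower_bounds` is an assumption, $H_0$ is taken to specify mean zero and $H_1$ mean $\theta$, and $\theta$ is fixed by requiring that the right hand side of (4.30) equal `n_fixed`.

```python
import math
from statistics import NormalDist

def termination_lower_bounds(alpha, beta, n0, n_fixed=1000):
    """Lower bounds (4.61) and (4.62) for P_1(n0) and P_0(n0).

    H0: N(0, 1), H1: N(theta, 1), with theta chosen so that the fixed-sample
    test of strength (alpha, beta) needs n_fixed observations.
    """
    std = NormalDist()
    theta = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) / math.sqrt(n_fixed)
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    # z = theta*x - theta^2/2 has mean +/- theta^2/2 and standard deviation theta
    mean1, mean0, sd = theta**2 / 2, -theta**2 / 2, theta
    lam1 = (log_a - n0 * mean1) / (math.sqrt(n0) * sd)      # (4.57)
    lam0 = (log_b - n0 * mean0) / (math.sqrt(n0) * sd)      # (4.58)
    upper_tail = lambda x: 1 - std.cdf(x)                   # G of (4.56)
    return upper_tail(lam1), 1 - upper_tail(lam0)

bound_h1, bound_h0 = termination_lower_bounds(0.01, 0.05, 1000)
```

For $\alpha = .01$, $\beta = .05$ and $n_0 = 1000$ the two bounds come out near .799 and .891.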


Putting $\log A = \log \frac{1 - \beta}{\alpha}$ and $\log B = \log \frac{\beta}{1 - \alpha}$, Table 2 shows the values
of $\bar{P}_1(n_0)$ and $\bar{P}_0(n_0)$ corresponding to different pairs $(\alpha, \beta)$ and different values
of $n_0$. In these calculations it has been assumed that the distribution under
$H_0$ is a normal distribution with mean zero and unit variance, and the distribution
under $H_1$ is a normal distribution with mean $\theta$ and unit variance. For each pair
$(\alpha, \beta)$ the value of $\theta$ was determined so that the number of observations required
by the current most powerful test of strength $(\alpha, \beta)$ is equal to 1000.


TABLE 2

Lower bound of the probability* that a sequential analysis will terminate within
various numbers of trials, when the most powerful current
test requires exactly 1000 trials

            |  α = .01 and β = .01  |  α = .01 and β = .05  |  α = .05 and β = .05
  Number of |-----------------------+-----------------------+----------------------
  trials    | Alternat. |   Null    | Alternat. |   Null    | Alternat. |   Null
            | hyp. true | hyp. true | hyp. true | hyp. true | hyp. true | hyp. true
  ----------+-----------+-----------+-----------+-----------+-----------+----------
    1000    |   .910    |   .910    |   .799    |   .891    |   .773    |   .773
    1200    |   .950    |   .950    |   .871    |   .932    |   .837    |   .837
    1400    |   .972    |   .972    |   .916    |   .957    |   .883    |   .883
    1600    |   .985    |   .985    |   .946    |   .972    |   .915    |   .915
    1800    |   .991    |   .991    |   .965    |   .982    |   .938    |   .938
    2000    |   .995    |   .995    |   .977    |   .989    |   .955    |   .955
    2200    |   .997    |   .997    |   .985    |   .993    |   .967    |   .967
    2400    |   .999    |   .999    |   .990    |   .995    |   .976    |   .976
    2600    |   .999    |   .999    |   .994    |   .997    |   .982    |   .982
    2800    |   1.00    |   1.00    |   .996    |   .998    |   .987    |   .987
    3000    |   1.00    |   1.00    |   .997    |   .999    |   .990    |   .990

* The probabilities given are lower bounds for the true probabilities. They
relate to a test of the mean of a normally distributed variate, the difference
between the null and alternative hypothesis being adjusted for each pair of values
of α and β so that the number of trials required under the most powerful current
test is exactly 1000.


4.6. Truncated sequential analysis. In some applications a definite upper
bound for the number of observations may be desirable. Thus, a certain
integer $n_0$ is chosen so that if the sequential process does not lead to a final
decision for $n \leq n_0$, a new rule is given for the acceptance or rejection of $H_0$
at the stage $n = n_0$.

A simple and reasonable rule for the acceptance or rejection of $H_0$ at the stage
$n = n_0$ can be given as follows: If $\sum_{\alpha = 1}^{n_0} z_\alpha \leq 0$ we accept $H_0$ and if
$\sum_{\alpha = 1}^{n_0} z_\alpha > 0$ we accept $H_1$. By thus truncating the sequential process we change, however,
the probabilities of errors of the first and second kinds. Let $\alpha$ and $\beta$ be the
probabilities of errors of the first and second kinds, respectively, if the sequential
test is not truncated. Let $\alpha(n_0)$ and $\beta(n_0)$ be the probabilities of errors of the
first and second kinds if the test is truncated at $n = n_0$. We shall derive upper
bounds for $\alpha(n_0)$ and $\beta(n_0)$.

First we shall derive an upper bound for $\alpha(n_0)$. Let $\rho_0(n_0)$ be the probability
(under the null hypothesis) that the following three conditions are simultaneously
fulfilled:

(i) $\log B < \sum_{\alpha = 1}^{n} z_\alpha < \log A$ for $n = 1, \cdots, n_0 - 1$

(ii) $0 < \sum_{\alpha = 1}^{n_0} z_\alpha < \log A$

(iii) continuing the sequential process beyond $n_0$, it terminates with the
acceptance of $H_0$.

It is clear that

(4.63)    $\alpha(n_0) \leq \alpha + \rho_0(n_0)$.

Let $\bar{\rho}_0(n_0)$ be the probability (under the null hypothesis) that
$0 < \sum_{\alpha = 1}^{n_0} z_\alpha < \log A$. Then obviously

    $\rho_0(n_0) \leq \bar{\rho}_0(n_0)$

and consequently

(4.64)    $\alpha(n_0) \leq \alpha + \bar{\rho}_0(n_0)$.


Let $\rho_1(n_0)$ be the probability under the alternative hypothesis that the
following three conditions are simultaneously fulfilled:

(i) $\log B < \sum_{\alpha = 1}^{n} z_\alpha < \log A$ for $n = 1, \cdots, n_0 - 1$

(ii) $\log B < \sum_{\alpha = 1}^{n_0} z_\alpha \leq 0$

(iii) continuing the sequential process beyond $n_0$, it terminates with the
acceptance of $H_1$.

It is clear that

(4.65)    $\beta(n_0) \leq \beta + \rho_1(n_0)$.

Let $\bar{\rho}_1(n_0)$ be the probability (under the alternative hypothesis) that
$\log B < \sum_{\alpha = 1}^{n_0} z_\alpha \leq 0$. Then $\rho_1(n_0) \leq \bar{\rho}_1(n_0)$ and consequently

(4.66)    $\beta(n_0) \leq \beta + \bar{\rho}_1(n_0)$.
Let

    $\nu_1 = \frac{-n_0 E_0(z)}{\sqrt{n_0}\, \sigma_0(z)}, \quad \nu_2 = \frac{\log A - n_0 E_0(z)}{\sqrt{n_0}\, \sigma_0(z)}, \quad \nu_3 = \frac{-n_0 E_1(z)}{\sqrt{n_0}\, \sigma_1(z)}, \quad \nu_4 = \frac{\log B - n_0 E_1(z)}{\sqrt{n_0}\, \sigma_1(z)}$

where $\sigma_i(z)$ is the standard deviation of $z$ under $H_i$ $(i = 0, 1)$. Then

(4.67)    $\bar{\rho}_0(n_0) = G(\nu_1) - G(\nu_2)$

and

(4.68)    $\bar{\rho}_1(n_0) = G(\nu_4) - G(\nu_3)$.

From (4.64), (4.66), (4.67) and (4.68) we obtain

(4.69)    $\alpha(n_0) \leq \alpha + G(\nu_1) - G(\nu_2)$

and

(4.70)    $\beta(n_0) \leq \beta + G(\nu_4) - G(\nu_3)$.

The upper bounds given in (4.69) and (4.70) may considerably exceed $\alpha(n_0)$
and $\beta(n_0)$, respectively. It would be desirable to find closer limits.
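
The bounds (4.69) and (4.70) can be evaluated in the same normal setting as the lower bounds of section 4.5. The sketch below is an illustration under stated assumptions: the function name `truncation_upper_bounds` is hypothetical, $H_0$ specifies mean zero and $H_1$ mean $\theta$ with unit variance, and $\theta$ is chosen so that the right hand side of (4.30) equals `n_fixed`.

```python
import math
from statistics import NormalDist

def truncation_upper_bounds(alpha, beta, n0, n_fixed=1000):
    """Upper bounds (4.69) and (4.70) for alpha(n0) and beta(n0)."""
    std = NormalDist()
    theta = (std.inv_cdf(1 - alpha) + std.inv_cdf(1 - beta)) / math.sqrt(n_fixed)
    log_a = math.log((1 - beta) / alpha)
    log_b = math.log(beta / (1 - alpha))
    mean0, mean1 = -theta**2 / 2, theta**2 / 2   # E_0(z) and E_1(z)
    scale = math.sqrt(n0) * theta                # sqrt(n0) * sigma_i(z)
    nu1 = -n0 * mean0 / scale
    nu2 = (log_a - n0 * mean0) / scale
    nu3 = -n0 * mean1 / scale
    nu4 = (log_b - n0 * mean1) / scale
    G = lambda x: 1 - std.cdf(x)                 # upper tail of N(0, 1)
    return alpha + G(nu1) - G(nu2), beta + G(nu4) - G(nu3)

alpha_bound, beta_bound = truncation_upper_bounds(0.01, 0.05, 1000)
```

For $\alpha = .01$, $\beta = .05$ and $n_0 = 1000$ the bounds come out near .033 and .070.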

Table 3 shows the values of the upper bounds of $\alpha(n_0)$ and $\beta(n_0)$ given by
formulas (4.69) and (4.70) corresponding to different pairs $(\alpha, \beta)$ and different values
of $n_0$. In these calculations we have put $\log A = \log \frac{1 - \beta}{\alpha}$, $\log B = \log \frac{\beta}{1 - \alpha}$
and assumed that the distribution under $H_0$ is a normal distribution with mean
zero and unit variance, and the distribution under $H_1$ is a normal distribution
with mean $\theta$ and unit variance. For each pair $(\alpha, \beta)$ the value of $\theta$ has been
determined so that the number of observations required by the current most
powerful test of strength $(\alpha, \beta)$ is equal to 1000.

It seems to the author that the upper limits given in (4.69) and (4.70) are
considerably above the true $\alpha(n_0)$ and $\beta(n_0)$ respectively, when $n_0$ is not much
higher than the value of $n$ needed for the current most powerful test.

4.7. Efficiency of the sequential probability ratio test. Let $S$ be any
sequential test for which the probability of an error of the first kind is $\alpha$, the
probability of an error of the second kind is $\beta$ and the probability that the test
procedure will eventually terminate is one. Let $S'$ be the sequential
probability ratio test whose strength is equal to that of $S$. We shall prove that the
sequential probability ratio test is an optimum test, i.e., that $E_i(n \mid S) \geq
E_i(n \mid S')$ $(i = 0, 1)$, if for $S'$ the excess of $Z_n$ over $\log A$ and $\log B$ can be
neglected. This excess is exactly zero if $z$ can take only the values $d$ and $-d$
and if $\log A$ and $\log B$ are integral multiples of $d$. In any other case the excess
will not be identically zero. However, if $|Ez|$ and $\sigma_z$ are sufficiently small,
the excess of $Z_n$ over $\log A$ and $\log B$ is negligible.

TABLE 3

Effect on risks of error of truncating* a sequential analysis at a predetermined
number of trials

            |  α = .01 and β = .01  |  α = .01 and β = .05  |  α = .05 and β = .05
  Number of |-----------------------+-----------------------+----------------------
  trials    | Upper bd. | Upper bd. | Upper bd. | Upper bd. | Upper bd. | Upper bd.
            | of effec- | of effec- | of effec- | of effec- | of effec- | of effec-
            | tive α    | tive β    | tive α    | tive β    | tive α    | tive β
  ----------+-----------+-----------+-----------+-----------+-----------+----------
    1000    |   .020    |   .020    |   .033    |   .070    |   .095    |   .095
    1200    |   .015    |   .015    |   .024    |   .063    |   .082    |   .082
    1400    |   .013    |   .013    |   .019    |   .058    |   .072    |   .072
    1600    |   .012    |   .012    |   .016    |   .055    |   .066    |   .066
    1800    |   .011    |   .011    |   .014    |   .053    |   .062    |   .062
    2000    |   .010    |   .010    |   .012    |   .052    |   .058    |   .058
    2200    |   .010    |   .010    |   .012    |   .051    |   .056    |   .056
    2400    |   .010    |   .010    |   .011    |   .051    |   .055    |   .055
    2600    |   .010    |   .010    |   .011    |   .051    |   .053    |   .053
    2800    |   .010    |   .010    |   .010    |   .050    |   .053    |   .053
    3000    |   .010    |   .010    |   .010    |   .050    |   .052    |   .052

* If the sequential analysis is based on the values α and β shown, but a
decision is made at $n_0$ trials even when the normal sequential criteria would require
a continuation of the process, the realized values of α and β will not exceed the
tabular entries. The table relates to a test of the mean of a normally distributed
variate, the difference between the null and alternative hypotheses being
adjusted for each pair (α, β) so that the number of trials required by the current
test is 1000.

For any random variable $u$ we shall denote by $E_i^*(u \mid S)$ the conditional
expected value of $u$ under the hypothesis $H_i$ $(i = 0, 1)$ and under the restriction
that $H_0$ is accepted. Similarly, let $E_i^{**}(u \mid S)$ be the conditional expected value
of $u$ under the hypothesis $H_i$ $(i = 0, 1)$ and under the restriction that $H_1$ is
accepted. In the notations for these expected values the symbol $S$ stands for
the sequential test used. Denote by $Q_i(S)$ the totality of all samples for which
the test $S$ leads to the acceptance of $H_i$. Then we have


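(4.71)    $E_0^*\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_0(S)]}{P_0[Q_0(S)]} = \frac{\beta}{1 - \alpha}$,

(4.72)    $E_0^{**}\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_1(S)]}{P_0[Q_1(S)]} = \frac{1 - \beta}{\alpha}$,

(4.73)    $E_1^*\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_0(S)]}{P_1[Q_0(S)]} = \frac{1 - \alpha}{\beta}$

and

(4.74)    $E_1^{**}\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_1(S)]}{P_1[Q_1(S)]} = \frac{\alpha}{1 - \beta}$.
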
To prove the efficiency of the sequential probability ratio test, we shall first
derive two lemmas.

LEMMA 1. For any random variable $u$ the inequality

(4.75)    $e^{Eu} \leq Ee^{u}$

holds.

PROOF: Inequality (4.75) can be written as

(4.76)    $1 \leq Ee^{u'}$

where $u' = u - Eu$. Lemma 1 is proved if we show that (4.76) holds for any
random variable $u'$ with zero mean. Expanding $e^{u'}$ in a Taylor series around
$u' = 0$, we obtain

(4.77)    $e^{u'} = 1 + u' + \frac{1}{2} u'^2 e^{\xi(u')}$ where $0 \leq \xi(u') \leq u'$.

Hence

(4.78)    $Ee^{u'} = 1 + \frac{1}{2} E[u'^2 e^{\xi(u')}] \geq 1$

and Lemma 1 is proved.
LEMMA 2. Let $S$ be a sequential test such that there exists a finite integer $N$ with
the property that the number $n$ of observations required for the test is $\leq N$. Then

(4.79)    $E_i(n \mid S) = \frac{E_i\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)}{E_i(z)} \qquad (i = 0, 1)$.

The proof is omitted, since it is essentially the same as that of equation (4.5)
for the sequential probability ratio test.

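On the basis of Lemmas 1 and 2 we shall be able to derive the following
theorem.

THEOREM. Let $S$ be any sequential test for which the probability of an error
of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$ and the
probability that the test procedure will eventually terminate is equal to one. Then

(4.80)    $E_0(n \mid S) \geq \frac{1}{E_0(z)} \left[(1 - \alpha) \log \frac{\beta}{1 - \alpha} + \alpha \log \frac{1 - \beta}{\alpha}\right]$

and

(4.81)    $E_1(n \mid S) \geq \frac{1}{E_1(z)} \left[\beta \log \frac{\beta}{1 - \alpha} + (1 - \beta) \log \frac{1 - \beta}{\alpha}\right]$.
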
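PROOF: First we shall prove the theorem in the case when there exists a finite
integer $N$ such that $n$ never exceeds $N$. According to Lemma 2 we have

(4.82)    $E_0(n \mid S) = \frac{1}{E_0(z)} E_0\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)$
          $= \frac{1}{E_0(z)} \left[(1 - \alpha) E_0^*\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) + \alpha E_0^{**}\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)\right]$

and

(4.83)    $E_1(n \mid S) = \frac{1}{E_1(z)} E_1\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)$
          $= \frac{1}{E_1(z)} \left[\beta E_1^*\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) + (1 - \beta) E_1^{**}\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)\right]$.
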
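From equations (4.71)-(4.74) and Lemma 1 we obtain the inequalities

(4.84)    $E_0^*\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \leq \log \frac{\beta}{1 - \alpha}$,

(4.85)    $E_0^{**}\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \leq \log \frac{1 - \beta}{\alpha}$,

(4.86)    $E_1^*\left(\log \frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = -E_1^*\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \leq \log \frac{1 - \alpha}{\beta}$

and

(4.87)    $E_1^{**}\left(\log \frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = -E_1^{**}\left(\log \frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) \leq \log \frac{\alpha}{1 - \beta}$.

Since $E_0(z) < 0$, (4.80) follows from (4.82), (4.84) and (4.85). Similarly, since
$E_1(z) > 0$, (4.81) follows from (4.83), (4.86) and (4.87). This proves the
theorem when there exists a finite integer $N$ such that $n \leq N$.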

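To prove the theorem for any sequential test $S$ of strength $(\alpha, \beta)$, for any
positive integer $N$ let $S_N$ be the sequential test we obtain by truncating $S$ at the
$N$-th observation if no decision is reached before the $N$-th observation. Let
$(\alpha_N, \beta_N)$ be the strength of $S_N$. Then we have

(4.88)    $E_0(n \mid S) \geq E_0(n \mid S_N) \geq \frac{1}{E_0(z)} \left[(1 - \alpha_N) \log \frac{\beta_N}{1 - \alpha_N} + \alpha_N \log \frac{1 - \beta_N}{\alpha_N}\right]$

and

(4.89)    $E_1(n \mid S) \geq E_1(n \mid S_N) \geq \frac{1}{E_1(z)} \left[\beta_N \log \frac{\beta_N}{1 - \alpha_N} + (1 - \beta_N) \log \frac{1 - \beta_N}{\alpha_N}\right]$.

Since $\lim_{N = \infty} \alpha_N = \alpha$ and $\lim_{N = \infty} \beta_N = \beta$, inequalities (4.80) and (4.81) follow from
(4.88) and (4.89). Hence the proof of the theorem is completed.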

If for the sequential probability ratio test $S'$ the excess of the cumulative sum
$Z_n$ over the boundaries $\log A$ and $\log B$ is zero, $E_0(n \mid S')$ is exactly equal to the
right hand side member of (4.80) and $E_1(n \mid S')$ is exactly equal to the right hand
side member of (4.81). Hence, in this case $S'$ is exactly an optimum test.
If both $|Ez|$ and $\sigma_z$ are small, also the expected value of the excess over the
boundaries will be small and, therefore, $E_0(n \mid S')$ and $E_1(n \mid S')$ will be only
slightly larger than the right hand members of (4.80) and (4.81), respectively.
Thus, in such a case the sequential probability ratio test is, if not exactly, very
nearly an optimum test.¹²
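
The lower limits for the expected number of observations of any sequential test of strength $(\alpha, \beta)$ depend only on $\alpha$, $\beta$, $E_0(z)$ and $E_1(z)$. A minimal Python sketch follows; the function name `wald_lower_bounds` and the numerical values are assumptions made for illustration.

```python
import math

def wald_lower_bounds(alpha, beta, e0z, e1z):
    """Lower limits for E_0(n | S) and E_1(n | S) of a test of strength (alpha, beta).

    e0z = E_0(z) must be negative and e1z = E_1(z) positive.
    """
    log_b = math.log(beta / (1 - alpha))
    log_a = math.log((1 - beta) / alpha)
    bound0 = ((1 - alpha) * log_b + alpha * log_a) / e0z
    bound1 = (beta * log_b + (1 - beta) * log_a) / e1z
    return bound0, bound1

# Hypothetical example: alpha = beta = .05, E_0(z) = -.5, E_1(z) = .5.
bound0, bound1 = wald_lower_bounds(0.05, 0.05, -0.5, 0.5)
```

In this symmetric example both limits equal $1.8 \log 19$, or about 5.3 observations.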





12 The author conjectures that the sequential probability ratio test is exactly an
optimum test even if the excess of $Z_n$ over the boundaries is not zero. However, he did not
succeed in proving this.
PART II. SEQUENTIAL TEST OF A SIMPLE OR COMPOSITE HYPOTHESIS AGAINST
A SET OF ALTERNATIVES


In Part I we have dealt with the problem of testing a simple hypothesis $H_0$
against a single alternative $H_1$. Here we shall consider the problem of testing
a simple or composite hypothesis against a set of infinitely many alternatives.
By a simple hypothesis we mean a hypothesis which specifies uniquely the
probability distribution of the random variable $x$ under consideration. A
hypothesis is called composite, if it is not simple.

5. Test of a Simple Hypothesis Against One-sided Alternatives 


5.1. General remarks. Let $f(x, \theta)$ be the probability density function of a
random variable $X$, where $\theta$ is an unknown parameter. Suppose that it is
required to test the simple hypothesis that $\theta = \theta_0$ and that the alternative values
of $\theta$ are restricted to values $\theta > \theta_0$. Assume that it is desired to have a
sequential test such that the probability of an error of the first kind is equal to a given $\alpha$.

The probability of an error of the second kind is no longer a single value, but
is a function of the true value of $\theta$. If $f(x, \theta)$ is a continuous function of $x$ and
$\theta$, the probability of an error of the second kind will be arbitrarily near $1 - \alpha$
if the true value of $\theta$ is sufficiently near $\theta_0$. Hence, if $\alpha$ is small, the
probability of an error of the second kind is necessarily large when the true value of $\theta$
is very near $\theta_0$. In most practical applications we do not care if the
probability of an error of the second kind is high when the true value of $\theta$ is very
near $\theta_0$, since in this case the error committed by accepting $\theta_0$ is usually of very
little importance. However, there will be a value $\theta_1 > \theta_0$ such that we wish the
probability of an error of the second kind to be less than or equal to a given small
positive value $\beta$ whenever the true value of $\theta$ is greater than or equal to $\theta_1$.

In this case we can proceed as follows: Consider the single alternative
hypothesis $H_1$ that $\theta = \theta_1$. Construct a sequential test for testing $\theta = \theta_0$ against the
single alternative $H_1$ such that the probability of an error of the first kind is $\alpha$
and the probability of an error of the second kind, i.e., the probability of
accepting $\theta_0$ when $\theta_1$ is true, is $\beta$. If this sequential test has the further property
that the probability of an error of the second kind is less than or equal to $\beta$
whenever the true value of $\theta$ is greater than $\theta_1$, then this sequential test
provides a satisfactory solution of the problem of testing the hypothesis that $\theta = \theta_0$
against the set of alternatives $\theta > \theta_0$.

In most of the important cases occurring in practice, such as when $X$ has a
normal, binomial, or Poisson distribution, etc., the sequential probability ratio
test for testing the hypothesis that $\theta = \theta_0$ against a single alternative $\theta_1$ $(\theta_1 > \theta_0)$
satisfies the condition that the probability of an error of the second kind is a
monotonically decreasing function of $\theta$ in the domain $\theta > \theta_0$. Thus, in all these
cases the sequential probability ratio test for testing the hypothesis that $\theta = \theta_0$
against a properly chosen alternative $\theta_1$ provides a satisfactory solution of our
problem.







The case in which the alternative values of $\theta$ are restricted to values less than
$\theta_0$ is entirely analogous to that in which the alternatives are restricted to values
greater than $\theta_0$, and need not be discussed separately.

It should be pointed out that the test procedure for testing $\theta = \theta_0$ against
alternatives $\theta > \theta_0$, as described in this section, is also suitable for testing the
composite hypothesis that $\theta \leq \theta_0$, provided that the probability of rejecting
the null hypothesis is $\leq \alpha$ whenever the true value of $\theta$ is $\leq \theta_0$. This
condition is fulfilled, for instance, when $X$ has a normal, binomial or Poisson
distribution.

5.2. Application to binomial distributions. 5.2.1. Statement of the problem. 
The case of a binomial distribution arises when the result of a single observa- 
tion is a classification into one of two categories. For example, this is the 
situation in acceptance inspection of manufactured products, if each unit 
inspected is classified into one of the two categories, non-defective and defective. 
Let p denote the probability that an item belongs to a given category. The 
value of p is usually unknown. We shall deal here with the problem of testing 
the hypothesis that p does not exceed a given value p’ against the alternative 
possibility that p > p’. 

Since acceptance inspection of manufactured products is perhaps the most 
important and widest field of application of such a test procedure, we shall, in 
continuing the discussion, use the terminology of acceptance inspection. This, 
of course, does not mean that the test procedure is not applicable to other 
cases. Suppose that a lot containing a large number of units is submitted for 
sampling inspection. Let p denote the proportion of defective units contained 
in the lot. The probability that a unit drawn at random from the lot will be 
defective is equal to p. If m units are drawn at random from the lot, the probability
that there will be d defectives among them is given by¹³

$$\frac{m!}{d!\,(m-d)!}\; p^{d} (1-p)^{m-d} \qquad (d = 0, 1, \dots, m). \tag{5.1}$$

The probability distribution as given in (5.1) is called a binomial distribution. 
The purpose of sampling inspection is to decide whether the lot should be 
accepted or rejected. It is clear that for high values of p we want to reject the 
lot and for low values of p we want to accept the lot. Thus, it will be possible 
to specify a particular value of p, say p’, so that if p < p’ we wish to accept the 
lot, and if p > p’ we wish to reject the lot. Thus, our problem is to devise a 
proper sampling inspection plan for testing the hypothesis that $p \le p'$.
5.2.2. Tolerated risks for making a wrong decision. No sampling inspection 
plan can guarantee that the correct decision will always be made, i.e., that the 
lot will always be accepted when p < p’ and the lot will always be rejected when 
$p > p'$, unless the lot is inspected completely. A complete inspection is usually
rather uneconomical and one is willing to take some risk of making a wrong
decision if this permits a reduction in the amount of inspection. Hence, recommendations
as to the proper choice of a sampling inspection plan can be made
only after the risks that can be tolerated have been stated.

¹³ Formula (5.1) is exact only if the lot contains infinitely many units. While the lot is
always finite in practice, we shall assume that m is small as compared with the lot size so
that formula (5.1) can be used.

If p is equal to the marginal value $p'$, we may say that it is indifferent to us
whether the lot is accepted or rejected. If $p < p'$ we prefer acceptance and
this preference is the stronger the smaller p. Similarly, if $p > p'$ we prefer
rejection of the lot and this preference increases as p increases. Thus, it will
be possible to select a value $p_0 < p'$ and a value $p_1 > p'$ such that the error is
considered serious only if we accept the lot when $p \ge p_1$, or we reject the lot
when $p \le p_0$.

After the two values $p_0$ and $p_1$ have been selected the risks that we are willing
to tolerate may reasonably be stated as follows: a sampling inspection plan is
required such that the probability of rejecting the lot is less than or equal to a
preassigned value $\alpha$ whenever $p \le p_0$, and the probability of accepting the lot
is less than or equal to a preassigned value $\beta$ whenever $p \ge p_1$. Thus, the
tolerated risks are characterized by the four quantities $p_0$, $p_1$, $\alpha$ and $\beta$. The
proper sampling plan can be determined after these four quantities have been
chosen.

5.2.3. The sequential probability ratio test corresponding to the quantities $p_0$,
$p_1$, $\alpha$ and $\beta$. Let $H_0$ be the hypothesis that $p = p_0$ and $H_1$ the hypothesis that
$p = p_1$. Consider the sequential probability ratio test T for testing $H_0$ against
$H_1$ for which $\alpha$ is the probability of accepting $H_1$ when $H_0$ is true (error of the
first kind) and $\beta$ is the probability of accepting $H_0$ when $H_1$ is true (error of the
second kind). This probability ratio test will satisfy all our requirements, since
for this test the probability of accepting the lot (accepting $H_0$) is $\le \beta$ whenever
$p \ge p_1$ and the probability of rejecting the lot (accepting $H_1$) is $\le \alpha$ whenever
$p \le p_0$.

According to formulas (3.8), (3.9), (3.10) and section 3.3 the sequential test
T is given as follows: At each stage of the inspection, at the m-th observation
for each integral value of m, calculate the quantity

$$\frac{p_{1m}}{p_{0m}} = \frac{p_1^{d_m} (1 - p_1)^{m - d_m}}{p_0^{d_m} (1 - p_0)^{m - d_m}} \qquad (m = 1, 2, \dots) \tag{5.2}$$

where $d_m$ denotes the number of defectives found in the first m units inspected.
Reject the lot (accept $H_1$) if

$$\frac{p_{1m}}{p_{0m}} \ge \frac{1 - \beta}{\alpha}. \tag{5.3}$$

Accept the lot if

$$\frac{p_{1m}}{p_{0m}} \le \frac{\beta}{1 - \alpha}. \tag{5.4}$$

Take an additional observation if¹⁴

$$\frac{\beta}{1 - \alpha} < \frac{p_{1m}}{p_{0m}} < \frac{1 - \beta}{\alpha}. \tag{5.5}$$

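For the purpose of practical computations it is useful to rewrite the inequalities
(5.3), (5.4) and (5.5) in a somewhat different form. Taking the logarithms of
both sides of the inequalities (5.3), (5.4) and (5.5) one can easily verify that
these inequalities are equivalent to

$$d_m \ge \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}, \tag{5.6}$$

$$d_m \le \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}, \tag{5.7}$$

and

$$\frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} < d_m < \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}. \tag{5.8}$$
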
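The verification can be sketched in one step for (5.3), using only the symbols already defined. Taking logarithms in (5.2) gives

$$\log\frac{p_{1m}}{p_{0m}} = d_m \log\frac{p_1}{p_0} + (m - d_m)\log\frac{1 - p_1}{1 - p_0},$$

so that (5.3) reads

$$d_m\left(\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}\right) \ge \log\frac{1-\beta}{\alpha} + m\,\log\frac{1-p_0}{1-p_1}.$$

Since $p_1 > p_0$, the factor multiplying $d_m$ is positive, and dividing by it gives (5.6) without reversing the inequality. The inequalities (5.7) and (5.8) follow in the same way from (5.4) and (5.5).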


Using the inequalities (5.6), (5.7) and (5.8) the test procedure can easily be
carried out as follows: For each m we compute the acceptance number

$$A_m = \frac{\log\frac{\beta}{1-\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} \tag{5.9}$$

and the rejection number

$$R_m = \frac{\log\frac{1-\beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}} + m\,\frac{\log\frac{1-p_0}{1-p_1}}{\log\frac{p_1}{p_0} - \log\frac{1-p_1}{1-p_0}}. \tag{5.10}$$

¹⁴ There is a slight approximation involved in the formulas (5.3), (5.4) and (5.5). For
details see section 3.3.

These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated
before inspection starts. Inspection is continued as long as $A_m < d_m < R_m$.
At the first time when $d_m$ does not lie between the acceptance and rejection
numbers, the sampling inspection is terminated. The lot is accepted if $d_m \le A_m$
and the lot is rejected if $d_m \ge R_m$.
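The tabulation and stopping rule just described can be sketched in Python. This is an illustration only, not part of the original text: the names `binomial_sprt_lines` and `run_inspection` are hypothetical, and natural logarithms are assumed (the base cancels in (5.9) and (5.10)).

```python
import math

def binomial_sprt_lines(p0, p1, alpha, beta):
    """Return (a0, r0, s) so that A_m = a0 + s*m and R_m = r0 + s*m, as in (5.9), (5.10)."""
    denom = math.log(p1 / p0) - math.log((1 - p1) / (1 - p0))
    s = math.log((1 - p0) / (1 - p1)) / denom      # common slope of both lines
    a0 = math.log(beta / (1 - alpha)) / denom      # intercept of the acceptance line
    r0 = math.log((1 - beta) / alpha) / denom      # intercept of the rejection line
    return a0, r0, s

def run_inspection(units, p0, p1, alpha, beta):
    """units: iterable of 0/1 with 1 = defective. Returns (decision, m)."""
    a0, r0, s = binomial_sprt_lines(p0, p1, alpha, beta)
    d = 0   # defectives found so far, d_m
    m = 0   # units inspected so far
    for m, unit in enumerate(units, start=1):
        d += unit
        if d <= a0 + s * m:
            return "accept", m
        if d >= r0 + s * m:
            return "reject", m
    return "undecided", m   # the supplied units ran out before a decision
```

For instance, `run_inspection(sample, 0.1, 0.3, 0.05, 0.05)` applies the rule to a recorded 0/1 sequence `sample`; the four numerical values are arbitrary choices for the example.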

The test procedure can also be carried out graphically as indicated in Figure 2.
The number m of observations made is measured along the abscissa axis. Since
$A_m$ is a linear function of m, the points $(m, A_m)$ will lie on a straight line $L_0$.
Similarly, the points $(m, R_m)$ will lie on a straight line $L_1$. We draw the lines
$L_0$ and $L_1$ and the points $(m, d_m)$ are plotted as inspection goes on. At the first
time when the point $(m, d_m)$ does not lie between the lines $L_0$ and $L_1$ inspection
is terminated. The lot is rejected if the point $(m, d_m)$ lies on $L_1$ or above, and the
lot is accepted if the point $(m, d_m)$ lies on $L_0$ or below.

[Fig. 2. The lines $L_0$ and $L_1$ plotted against the number m of observations.]

5.2.4. The operating characteristic curve of the test. As mentioned in section
5.2.3 the test procedure defined by the inequalities (5.6), (5.7) and (5.8) will
satisfy the requirement that the probability of accepting the lot is $\le \beta$ whenever
$p \ge p_1$ and the probability of rejecting the lot is $\le \alpha$ whenever $p \le p_0$.
Although this already describes the essential features of the test procedure, it
may be desirable to know the probability $L_p$ of accepting the lot for any possible
value p of the proportion of defectives in the lot. Clearly, $L_p$ will be a function
of p and can be plotted as shown in Figure 3. The curve $L_p$ is called the operating
characteristic curve. The range of p is, of course, from 0 to 1. $L_p = 1$
for p = 0 and $L_p = 0$ for p = 1. The value of $L_p$ decreases as p increases.
We already know that $L_{p_0} = 1 - \alpha$ and $L_{p_1} = \beta$. Now we shall give a method
for computing the value of $L_p$ for any p. If $p_1$ is not far from $p_0$, which will
usually be the case in practice, a good approximation to $L_p$ is given by (see
equation 3.35)

$$L_p \sim \frac{\left(\frac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{h} - \left(\frac{\beta}{1-\alpha}\right)^{h}} \tag{5.11}$$

where h is equal to the non-zero root of the equation

$$p\left(\frac{p_1}{p_0}\right)^{h} + (1 - p)\left(\frac{1 - p_1}{1 - p_0}\right)^{h} = 1. \tag{5.12}$$

To plot the operating characteristic curve, it is not necessary to solve (5.12)
with respect to h. Instead we can proceed as follows: From (5.12) we express
p as a function of h, i.e.,

$$p = \frac{1 - \left(\frac{1 - p_1}{1 - p_0}\right)^{h}}{\left(\frac{p_1}{p_0}\right)^{h} - \left(\frac{1 - p_1}{1 - p_0}\right)^{h}}. \tag{5.13}$$

For any given value h we compute the value of p from (5.13) and the value of
$L_p$ from (5.11). The point $(p, L_p)$ obtained in this way will be a point of the
operating characteristic curve. Doing this for various values of h we can
obtain a sufficient number of points on the operating characteristic curve so
that the curve can be drawn.
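The parametric construction above lends itself to a short numerical sketch. The Python below is an illustration under assumptions: the function names `oc_point` and `oc_curve` are hypothetical, and h = 0 is skipped because (5.11) and (5.13) both take the form 0/0 there.

```python
import math

def oc_point(h, p0, p1, alpha, beta):
    """Return (p, L_p) for a non-zero h, using (5.13) for p and (5.11) for L_p."""
    a = (p1 / p0) ** h
    b = ((1 - p1) / (1 - p0)) ** h
    p = (1 - b) / (a - b)                    # equation (5.13)
    big = ((1 - beta) / alpha) ** h
    small = (beta / (1 - alpha)) ** h
    L = (big - 1) / (big - small)            # equation (5.11)
    return p, L

def oc_curve(p0, p1, alpha, beta, hs):
    """Points of the operating characteristic curve, sorted by increasing p."""
    return sorted(oc_point(h, p0, p1, alpha, beta) for h in hs if h != 0)
```

Taking h = 1 and h = -1 reproduces the two known points $(p_0, 1 - \alpha)$ and $(p_1, \beta)$, which gives a quick check on any implementation.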
5.2.5. The average amount of inspection required by the test. Denote by $E_p(n)$
the expected value of the number of observations required by the test. Clearly,
$E_p(n)$ is a function of p. According to (4.8) a good approximation to the value
of $E_p(n)$ is given by

$$E_p(n) \sim \frac{L_p \log\frac{\beta}{1-\alpha} + (1 - L_p)\log\frac{1-\beta}{\alpha}}{p\,\log\frac{p_1}{p_0} + (1 - p)\log\frac{1 - p_1}{1 - p_0}} \tag{5.14}$$

where $L_p$ is given by (5.11). Plotting $E_p(n)$ as a function of p, the curve obtained
will, in general, be of the type shown in Fig. 4. The maximum will ordinarily
be reached between $p_0$ and $p_1$. Furthermore, the curve will, in general, be
increasing as p increases from 0 to $p_0$, and decreasing as p increases from $p_1$
to 1.


[Fig. 4. The expected number of observations $E_p(n)$ plotted against p.]
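Formula (5.14) can be evaluated along the same parameter h that is used for the operating characteristic curve. The following Python sketch is illustrative only; the name `oc_and_asn` is hypothetical, and values of h very close to 0 should be avoided because numerator and denominator of (5.14) both vanish there.

```python
import math

def oc_and_asn(h, p0, p1, alpha, beta):
    """Return (p, L_p, E_p(n)) for a non-zero h from (5.13), (5.11) and (5.14)."""
    a = (p1 / p0) ** h
    b = ((1 - p1) / (1 - p0)) ** h
    p = (1 - b) / (a - b)                                   # (5.13)
    big = ((1 - beta) / alpha) ** h
    small = (beta / (1 - alpha)) ** h
    L = (big - 1) / (big - small)                           # (5.11)
    num = L * math.log(beta / (1 - alpha)) + (1 - L) * math.log((1 - beta) / alpha)
    den = p * math.log(p1 / p0) + (1 - p) * math.log((1 - p1) / (1 - p0))
    return p, L, num / den                                  # (5.14)
```

Evaluating the function on a grid of h values and plotting the third component against the first gives a curve of the type described for Fig. 4.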


5.3. Sequential analysis of double dichotomies. 5.3.1. Formulation of the
problem. Suppose that we want to compare the effectiveness of two production
processes where the effectiveness of a production process is measured in terms
of the proportion of effective units in the sequence produced. We shall say that
a unit is effective if it has a certain desirable property, for example, if it withstands
a certain strain. Let $p_1$ be the proportion of effectives if process 1 is
used, and $p_2$ the proportion of effectives if process 2 is used. In other words,
$p_1$ is the probability that a unit produced will be effective if process 1 is used,
and $p_2$ is the probability that a unit produced will be effective if process 2 is
used. Suppose that the manufacturer does not know the values of $p_1$ and $p_2$,
and that process 1 is in operation. If $p_1 \ge p_2$, then the manufacturer wants to
retain process 1. However, if $p_1 < p_2$, especially if $p_1$ is substantially smaller
than $p_2$, the manufacturer would like to replace process 1 by process 2. Thus,
we are interested in testing the hypothesis that $p_1 \ge p_2$ against the alternative
that $p_1 < p_2$.

A more general formulation of the problem can be given as follows: Consider
two binomial distributions. Let $p_1$ be the probability of a success in a single
trial according to the first binomial distribution, and let $p_2$ be the probability
of a success in a single trial according to the second binomial distribution.
We shall use the symbol 1 for success and the symbol 0 for failure. Suppose
that the probabilities $p_1$ and $p_2$ are unknown. We consider the problem of testing
the hypothesis that $p_1 \ge p_2$ on the basis of a sample consisting of $N_1$ observations
from the first binomial distribution and $N_2$ observations from the second
binomial population. Since in many experiments the case $N_1 = N_2$ is mainly
of interest, and since this case (as we shall see later) makes an exact and simplified
mathematical treatment of the problem possible, we shall assume in what
follows that $N_1 = N_2 = N$ (say).

Thus, on the basis of the outcome of the two series of N independent trials
we have to decide whether the hypothesis $p_1 \ge p_2$ should be accepted or rejected.

5.3.2. The classical method. The classical solution of the problem for large N
is given as follows: Let $S_1$ be the number of successes in the first set of N trials
(drawn from the first binomial population), and let $S_2$ be the number of successes
in the second set of N trials (drawn from the second binomial population).
Denote $\frac{S_1 + S_2}{2N}$ by $\bar p$ and $1 - \bar p$ by $\bar q$. Then for large N the expression

$$\frac{S_2 - S_1}{\sqrt{2N\,\bar p\,\bar q}} \tag{5.15}$$

is normally distributed with zero mean and unit variance if $p_1 = p_2$. Suppose
that the level of significance we wish to choose is $\alpha$. Let $\lambda_\alpha$ be the value for
which the probability that a normal variate with zero mean and unit variance
will exceed $\lambda_\alpha$ is equal to $\alpha$. (For example, if $\alpha = .05$, $\lambda_\alpha = 1.64$.) Thus, if
$p_1 = p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is equal to $\alpha$.
If $p_1 > p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is less than $\alpha$.
According to the classical method the hypothesis that $p_1 \ge p_2$ is rejected if the
observed value of (5.15) exceeds $\lambda_\alpha$. This method involves an approximation.
The distribution of the expression (5.15) is not exactly normal even for large N.
For small N this method cannot be used, since the distribution of (5.15) is far
from normal. For small N, R. A. Fisher has proposed an exact method which,
however, involves cumbersome calculations. In section 5.3.3 we shall suggest
another method which is exact (does not involve any approximations) and is
simple to apply as far as computations are concerned. The latter method has
the further advantage of being suitable for sequential analysis to which existing
methods are not readily adaptable.
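As a small illustration of the classical method, the statistic (5.15) and the comparison with the normal critical value can be written as follows. The function names are hypothetical, and the use of `statistics.NormalDist` from the Python standard library to obtain the critical value is an implementation choice made for this sketch, not something taken from the text.

```python
import math
from statistics import NormalDist

def classical_statistic(s1, s2, n):
    """Expression (5.15): (S2 - S1) / sqrt(2 N p_bar q_bar)."""
    p_bar = (s1 + s2) / (2 * n)      # pooled proportion of successes
    q_bar = 1 - p_bar
    return (s2 - s1) / math.sqrt(2 * n * p_bar * q_bar)

def reject_p1_ge_p2(s1, s2, n, alpha=0.05):
    """Reject the hypothesis p1 >= p2 when (5.15) exceeds the upper alpha point."""
    critical = NormalDist().inv_cdf(1 - alpha)   # about 1.64 for alpha = .05
    return classical_statistic(s1, s2, n) > critical
```

The sketch is undefined when the pooled proportion is exactly 0 or 1, since the denominator of (5.15) then vanishes.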

5.3.3. An exact method. Let $a_1, \dots, a_N$ be the results in the first set of N
trials, and $b_1, \dots, b_N$ the results in the second set of N trials. These results are
arranged in the order observed. Consider the sequence of N pairs

$$(a_1, b_1), \dots, (a_N, b_N). \tag{5.16}$$

Let $t_1$ be the number of pairs (1, 0) and $t_2$ the number of pairs (0, 1) in this
sequence. We consider only the pairs (0, 1) and (1, 0) and base the test on them.
Let a be the outcome of an observation from the first population, and b the
outcome of an observation from the second population. The probability that
(a, b) = (1, 0) is equal to $p_1(1 - p_2)$, and the probability that (a, b) = (0, 1) is
equal to $(1 - p_1)p_2$. Hence, knowing that (a, b) is equal to one of the pairs
(0, 1) and (1, 0), the (conditional) probability that it is equal to (0, 1) is given by

$$p = \frac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)}, \tag{5.17}$$

and the (conditional) probability that it is equal to (1, 0) is given by

$$1 - p = \frac{p_1(1 - p_2)}{p_1(1 - p_2) + (1 - p_1)p_2}. \tag{5.18}$$

Hence, considering only the pairs (1, 0) and (0, 1) the variate $t_2$ is distributed like
the number of successes in a sequence of $t = t_1 + t_2$ independent trials, the probability
of a success in a single trial being equal to p. One can easily verify that
$p = \tfrac12$ if $p_1 = p_2$, $p < \tfrac12$ if $p_1 > p_2$ and $p > \tfrac12$ if $p_1 < p_2$. Thus, the hypothesis
to be tested, i.e., the hypothesis that $p_1 \ge p_2$, is equivalent to the hypothesis
that $p \le \tfrac12$. Thus, we can test the hypothesis that $p_1 \ge p_2$ by testing the
hypothesis that $p \le \tfrac12$ on the basis of the observed value of $t_2$. Since the distribution
of $t_2$ is the same as the distribution of the number of successes in $t = t_1 + t_2$
independent trials (t is treated as a constant and the probability of a success
in a single trial is equal to p), the test procedure can be carried out in the usual
manner. If we want a level of significance $\alpha$, a critical value T is chosen so that
for $p = \tfrac12$ the probability that $t_2 \ge T$ is equal to $\alpha$. The hypothesis that $p \le \tfrac12$
is rejected if and only if the observed $t_2$ is greater than or equal to the critical
value T. The value of T can be obtained from a table of the binomial distribution.
If t is large, $t_2$ is nearly normally distributed and the critical value T can
be obtained from a table of the normal distribution.
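The counting of pairs and the choice of the critical value can be sketched in Python as follows. The function names are hypothetical. One assumption is made explicit: because $t_2$ is discrete, a tail probability exactly equal to $\alpha$ is usually not attainable, so the sketch takes T to be the smallest value whose tail probability under $p = \tfrac12$ does not exceed $\alpha$.

```python
import math

def count_discordant(a, b):
    """Return (t1, t2): the numbers of pairs (1, 0) and (0, 1) in the paired results."""
    t1 = sum(1 for x, y in zip(a, b) if (x, y) == (1, 0))
    t2 = sum(1 for x, y in zip(a, b) if (x, y) == (0, 1))
    return t1, t2

def critical_value(t, alpha):
    """Smallest T with P(t2 >= T) <= alpha when p = 1/2; returns t + 1 if no such T <= t."""
    tail = 0.0
    T = t + 1                       # t2 >= t + 1 is impossible, so never reject
    for k in range(t, -1, -1):
        tail += math.comb(t, k) / 2 ** t
        if tail > alpha:
            break
        T = k
    return T

def reject_p1_ge_p2(a, b, alpha):
    """Exact test of 5.3.3: reject when the observed t2 reaches the critical value."""
    t1, t2 = count_discordant(a, b)
    return t2 >= critical_value(t1 + t2, alpha)
```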

This procedure thus provides a simple test of the hypothesis that $p_1 \ge p_2$.
The question arises whether the efficiency of this method is as high as that of the
classical method. It would seem that the method suggested here cannot be a
most efficient procedure, since the values of $t_1$ and $t_2$ depend on the order of the
elements in the sequences $(a_1, \dots, a_N)$ and $(b_1, \dots, b_N)$, and there is no
particular reason to arrange them in the order observed. However, it has been
shown in [7] that the loss in efficiency as compared with the classical method is
negligible if the number N of trials is large.¹⁵

It should be pointed out that the procedure for testing the hypothesis that
$p_1 \ge p_2$ can be used also for testing the hypothesis that $p_1 = p_2$ if the alternative
hypotheses are restricted to $p_2 > p_1$.

In addition to simplicity and exactness the present method seems superior to
the classical one in the following respect: Suppose that (contrary to the original
assumption) the probability of a success varies from trial to trial. Denote by
$p_i^{(1)}$ the probability of success in the i-th trial of the first set, and by $p_i^{(2)}$ the
probability of success in the i-th trial in the second set (i = 1, ..., N). Assume
that the probabilities $p_i^{(1)}$ and $p_i^{(2)}$ are entirely unknown and we wish to test the
hypothesis that $p_1^{(1)} - p_1^{(2)} = \dots = p_N^{(1)} - p_N^{(2)} = 0$. In this case the classical
method is not applicable, but the present method provides a correct procedure.
Such a situation may arise, for instance, if we want to test the hypothesis that
the probability of a success (hitting the target) is the same for two different guns.
In the course of the experiments the probability of a hit may change due to external
conditions such as wind, disposition of the gunner, etc. However, these
external conditions are likely to affect both guns equally if the trials are made
alternately (or approximately alternately), so that if the two guns are equally
good we have $p_i^{(1)} = p_i^{(2)}$ (i = 1, ..., N).

¹⁵ The author believes that the loss in efficiency is slight even when N is small, although
no exact investigation of this case has been made.

5.3.4. Sequential test of the hypothesis that $p_1 \ge p_2$. In order to devise a proper
sequential test for testing the hypothesis that $p_1 \ge p_2$, we have to state first
what risks of making wrong decisions we are willing to tolerate. The efficiency
of the production process 1 may be measured by the ratio of effectives to ineffectives
produced, i.e., by $k_1 = \frac{p_1}{1 - p_1}$. Production process 1 may be regarded
the more efficient the larger the value of $k_1$. Similarly, the efficiency of production
process 2 may be measured by $k_2 = \frac{p_2}{1 - p_2}$. The relative superiority of
production process 2 over the process 1 can then reasonably be measured by the
ratio of $k_2$ to $k_1$, i.e., by

$$u = \frac{k_2}{k_1} = \frac{p_2(1 - p_1)}{p_1(1 - p_2)}. \tag{5.19}$$

If u = 1, the two processes are equally good. If u > 1, process 2 is superior to
process 1, and if u < 1, process 1 is superior to process 2. Thus, the manufacturer
will, in general, be able to select two values of u, $u_0$ and $u_1$ say ($u_0 < u_1$),
such that the rejection of process 1 in favor of process 2 is considered an error of
practical importance whenever the true value of $u \le u_0$, and the maintenance
of process 1 is considered an error of practical importance whenever $u \ge u_1$.
If u lies between $u_0$ and $u_1$, the manufacturer does not care particularly which
decision is taken.

Clearly, we will always have $u_0 < u_1$. If the transition from production
process 1 to process 2 involves some cost or other inconveniences, it seems
reasonable to put $u_0 = 1$ (or $u_0$ may even be slightly greater than one). This
choice of $u_0$ really means that we consider the rejection of process 1 a serious error
whenever this process is not inferior to process 2. On the other hand, if the
transition from process 1 to process 2 does not involve any inconveniences, the
rejection of process 1 in favor of 2 cannot be a serious error when the two processes
are equally efficient, i.e., when u = 1. Thus, in such a case, it seems reasonable
to choose $u_0$ somewhat below 1.
After the quantities $u_0$ and $u_1$ have been chosen the risks that we are willing
to tolerate may reasonably be expressed in the following form: The probability
of rejecting process 1 should not exceed a preassigned value $\alpha$ whenever $u \le u_0$,
and the probability of maintaining process 1 should not exceed a preassigned
value $\beta$ whenever $u \ge u_1$.

Thus, the risks that we are willing to tolerate are characterized by the four
quantities $u_0$, $u_1$, $\alpha$ and $\beta$. After these four quantities have been chosen, a
proper sequential test can be carried out as follows: The (conditional) probability
that we obtain a pair (0, 1), as given in (5.17), can be expressed as a function
of u. In fact

$$p = \frac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)} = \frac{\dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}}{1 + \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}} = \frac{u}{1 + u}. \tag{5.20}$$

Let $H_0$ denote the hypothesis that $p = \frac{u_0}{1 + u_0}$, and $H_1$ the hypothesis that
$p = \frac{u_1}{1 + u_1}$. A proper sequential test satisfying our requirements concerning
tolerated risks is the sequential probability ratio test of $H_0$ against $H_1$. The
acceptance and rejection numbers of this test can be obtained from
(5.9) and (5.10) by substituting $\frac{u_0}{1 + u_0}$ for $p_0$, $\frac{u_1}{1 + u_1}$ for $p_1$ and $t = t_1 + t_2$ for m.
Thus, for each value of t the acceptance number is given by

$$A_t = \frac{\log\frac{\beta}{1 - \alpha}}{\log u_1 - \log u_0} + t\,\frac{\log\frac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0} \tag{5.21}$$

and the rejection number is given by

$$R_t = \frac{\log\frac{1 - \beta}{\alpha}}{\log u_1 - \log u_0} + t\,\frac{\log\frac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}. \tag{5.22}$$

These acceptance numbers $A_t$ and rejection numbers $R_t$ (t = 1, 2, ...) are best
tabulated before experimentation starts. The sequential test is then carried out
as follows: The observations are taken in pairs where each pair consists of an
observation from the first process and an observation from the second process.
We continue taking pairs as long as $A_t < t_2 < R_t$. At the first time when $t_2$
does not lie between the acceptance and rejection numbers, experimentation is
terminated. Process 1 is maintained if at this final stage $t_2 \le A_t$, and process 1
is rejected in favor of 2 if $t_2 \ge R_t$.
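The paired procedure can be sketched in Python along the lines of (5.21) and (5.22). The names `dichotomy_lines` and `compare_processes` and the decision strings are hypothetical, chosen for this illustration; pairs (0, 0) and (1, 1) are simply skipped, since only the pairs (0, 1) and (1, 0) enter the test.

```python
import math

def dichotomy_lines(u0, u1, alpha, beta):
    """Return (a0, r0, s) with A_t = a0 + s*t and R_t = r0 + s*t, as in (5.21), (5.22)."""
    denom = math.log(u1) - math.log(u0)
    s = math.log((1 + u1) / (1 + u0)) / denom
    a0 = math.log(beta / (1 - alpha)) / denom
    r0 = math.log((1 - beta) / alpha) / denom
    return a0, r0, s

def compare_processes(pairs, u0, u1, alpha, beta):
    """pairs: iterable of (a, b), a from process 1 and b from process 2 (1 = effective).

    Returns (decision, t), where t counts the pairs (0, 1) and (1, 0) used.
    """
    a0, r0, s = dichotomy_lines(u0, u1, alpha, beta)
    t = 0    # number of pairs (0, 1) and (1, 0) so far
    t2 = 0   # number of pairs (0, 1) so far
    for a, b in pairs:
        if a == b:
            continue                # (0, 0) and (1, 1) carry no information here
        t += 1
        if (a, b) == (0, 1):
            t2 += 1
        if t2 <= a0 + s * t:
            return "maintain process 1", t
        if t2 >= r0 + s * t:
            return "replace by process 2", t
    return "undecided", t
```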

The test procedure can also be carried out graphically as shown in Figure 5.
The total number t of pairs (0, 1) and (1, 0) is measured along the horizontal
axis. The points $(t, A_t)$ will lie on a straight line $L_0$, since $A_t$ is a linear function
of t. The points $(t, R_t)$ will lie on a parallel line $L_1$. We draw the lines $L_0$ and
$L_1$ and plot the points $(t, t_2)$ as experimentation goes on. At the first time when
the point $(t, t_2)$ is not within the lines $L_0$ and $L_1$ experimentation is terminated.
Process 1 is maintained if at the final stage the point $(t, t_2)$ lies on $L_0$ or below,
and process 1 is rejected if the point $(t, t_2)$ lies on $L_1$ or above.

5.3.5. The operating characteristic curve of the test. For any value u of the ratio
$k_2/k_1$ we shall denote by $L_u$ the probability of maintaining process 1. Clearly, $L_u$
is a function of u. This function $L_u$ is called the operating characteristic curve
of the test. The operating characteristic curve can be determined from the
equations (5.11) and (5.13) by substituting $\frac{u_1}{1 + u_1}$ for $p_1$ and $\frac{u_0}{1 + u_0}$ for $p_0$.

[Fig. 5. The lines $L_0$ and $L_1$ plotted against the number t of pairs (0, 1) and (1, 0).]

These equations are:

$$L_u \sim \frac{\left(\frac{1-\beta}{\alpha}\right)^{h} - 1}{\left(\frac{1-\beta}{\alpha}\right)^{h} - \left(\frac{\beta}{1-\alpha}\right)^{h}} \tag{5.23}$$

and

$$\frac{u}{1 + u} = \frac{1 - \left(\frac{1 + u_0}{1 + u_1}\right)^{h}}{\left(\frac{u_1(1 + u_0)}{u_0(1 + u_1)}\right)^{h} - \left(\frac{1 + u_0}{1 + u_1}\right)^{h}}. \tag{5.24}$$

For any given value h we compute the values of u and $L_u$ from these equations.
The point $(u, L_u)$ obtained in this way will be a point of the operating characteristic
curve. Calculating the points $(u, L_u)$ for a sufficiently large number of
values of h we can draw the operating characteristic curve.
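A short Python sketch of this computation follows. The name `oc_point_u` is hypothetical, and h = 0 must be excluded because both (5.23) and (5.24) become 0/0 there.

```python
def oc_point_u(h, u0, u1, alpha, beta):
    """Return (u, L_u) for a non-zero h, using (5.24) for u and (5.23) for L_u."""
    a = (u1 * (1 + u0) / (u0 * (1 + u1))) ** h
    b = ((1 + u0) / (1 + u1)) ** h
    p = (1 - b) / (a - b)            # right-hand side of (5.24), equal to u / (1 + u)
    u = p / (1 - p)
    big = ((1 - beta) / alpha) ** h
    small = (beta / (1 - alpha)) ** h
    L = (big - 1) / (big - small)    # equation (5.23)
    return u, L
```

As a check, h = 1 returns the point $(u_0, 1 - \alpha)$ and h = -1 returns $(u_1, \beta)$.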

5.3.6. The average amount of inspection required by the test. For any value u
of the ratio $k_2/k_1$ denote by $E_u(t)$ the expected value of the total number of pairs
(0, 1) and (1, 0) required by the test. The value of $E_u(t)$ can be obtained from
(5.14) by substituting $E_u(t)$ for $E_p(n)$, $L_u$ for $L_p$, $\frac{u}{1 + u}$ for p, $\frac{u_1}{1 + u_1}$ for $p_1$, and
$\frac{u_0}{1 + u_0}$ for $p_0$. Thus

$$E_u(t) \sim \frac{L_u \log\frac{\beta}{1 - \alpha} + (1 - L_u)\log\frac{1 - \beta}{\alpha}}{\frac{u}{1 + u}\log\frac{u_1(1 + u_0)}{u_0(1 + u_1)} + \frac{1}{1 + u}\log\frac{1 + u_0}{1 + u_1}}. \tag{5.25}$$

To compute the expected value of the total number of pairs (including also
the pairs (0, 0) and (1, 1)), we merely have to divide the right side expression in
(5.25) by $p_1(1 - p_2) + p_2(1 - p_1)$.
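The two quantities just described can be computed as in the following Python sketch. The function names are hypothetical, and the value of $L_u$ is assumed to have been obtained separately from (5.23).

```python
import math

def expected_discordant_pairs(u, L_u, u0, u1, alpha, beta):
    """Approximate E_u(t) from (5.25); L_u is the operating characteristic at u."""
    num = L_u * math.log(beta / (1 - alpha)) + (1 - L_u) * math.log((1 - beta) / alpha)
    den = (u / (1 + u)) * math.log(u1 * (1 + u0) / (u0 * (1 + u1))) \
        + (1 / (1 + u)) * math.log((1 + u0) / (1 + u1))
    return num / den

def expected_total_pairs(e_t, p1, p2):
    """Expected number of all pairs, including (0, 0) and (1, 1)."""
    # Divide by the probability that a single pair is (0, 1) or (1, 0).
    return e_t / (p1 * (1 - p2) + p2 * (1 - p1))
```

The denominator of (5.25) vanishes for one value of u between $u_0$ and $u_1$, so the sketch should not be evaluated at or very near that point.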

In the rare event that no decision is yet reached at a number of pairs equal to 
three times the expected value, we can truncate the test at that stage without 
seriously affecting the probabilities of making a wrong decision (see section 4.6 
in Part I). 

5.3.7. Observations made in groups of r. In applications it may happen that at
each stage in the sequential process instead of drawing a single observation we
draw r observations from each of the binomial distributions. Hence, instead of
a single pair, we have two sets of r observations. If the order of observations
in each such set of r is recorded, we can establish the number of pairs (0, 1) and
the number of pairs (1, 0) for each pair of sets of r observations. In such a case
the test can be carried out as described in section 5.3.4, since after each pair of
sets of r observations we can compute t and $t_2$. The only effect of taking the
observations in groups of r is that more observations will generally be necessary
(approximately enough to fill out a group) and thereby the probability of making
an incorrect decision will be made somewhat smaller. However, if the order of
observations in such groups of r is not recorded, the difficulty arises that we are
not able to determine the values of t and $t_2$ needed for the test procedure. It has
been shown in [7] that in such a case we may replace t and $t_2$ by certain estimates
of t and $t_2$ without affecting seriously the probability of making an incorrect
decision. The estimates of $t_1$ and $t_2$ (and thereby also an estimate of $t = t_1 + t_2$)
are obtained as follows: Let $r_1$ be the number of successes in the group of r observations
drawn from the first binomial distribution, and let $r_2$ be the number
of successes in the group of r observations drawn from the second binomial distribution.
Then for this pair of groups of r observations, we estimate the number
of pairs (1, 0) to be $r_1 - \frac{r_1 r_2}{r}$ and the number of pairs (0, 1) to be $r_2 - \frac{r_1 r_2}{r}$. Thus,
an estimate of $t_1$ is obtained by summing $r_1 - \frac{r_1 r_2}{r}$ over all pairs of groups observed,
and that of $t_2$ is obtained by summing $r_2 - \frac{r_1 r_2}{r}$ over all pairs of groups
observed.
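The estimates for unordered groups amount to a simple accumulation, sketched here in Python. The function name and the input layout (a list of success counts $(r_1, r_2)$, one entry per pair of groups) are assumptions made for the illustration.

```python
def estimate_discordant(groups, r):
    """groups: iterable of (r1, r2) success counts per pair of groups of size r.

    Returns (t1_hat, t2_hat), the estimated numbers of pairs (1, 0) and (0, 1).
    """
    t1_hat = 0.0
    t2_hat = 0.0
    for r1, r2 in groups:
        common = r1 * r2 / r       # estimated number of pairs (1, 1) in this pair of groups
        t1_hat += r1 - common
        t2_hat += r2 - common
    return t1_hat, t2_hat
```

The estimate of t is then the sum of the two returned values, and these may take non-integral values.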


5.4. Application to testing the mean of a normal distribution with known standard
deviation. 5.4.1. Formulation of the problem. Suppose that a measurable
quantity x is normally distributed with unknown mean $\theta$ and known standard
deviation $\sigma$. For example, x may be some measurable quality characteristic
of a unit of a certain product where x is normally distributed with a known
standard deviation in the population of all units. The problem we shall consider
here is to test the hypothesis that the unknown mean $\theta$ is less than a specified
value $\theta'$. This problem arises frequently, for example, in quality control.
Suppose that the quality of the product is considered the better the higher the
mean value of x. Thus, there will be a value $\theta'$ such that the product is considered
sub-standard if $\theta < \theta'$ and the product is considered to meet specifications
if $\theta \ge \theta'$. Since $\theta$ is unknown, we are usually interested in testing the hypothesis
that $\theta < \theta'$, i.e., that the product is sub-standard.

Since quality control is an important field of application for such test procedures,
the discussion will be continued in the terminology of quality control.
This, of course, should not be interpreted as a restriction upon the general
validity and applicability of the test procedure. The problem treated in section
5.4 can now be stated as follows: Let x be a measurable quality characteristic
of a unit of a certain product. The variable x is supposed to be normally
distributed with known standard deviation in the population of all units produced.
The problem is to devise a sampling plan for testing the hypothesis
that the product is sub-standard. The product is said to be sub-standard, if
the mean $\theta$ of x is less than a given specified value $\theta'$.

5.4.2. Tolerated risks for making a wrong decision. No sampling plan can
guarantee that the correct decision will always be made, i.e., that the product
will be declared sub-standard if and only if $\theta < \theta'$. The larger the amount of
inspection, the smaller we can make the risks for making a wrong decision. If
inspection is costly, or destructive, we are willing to tolerate some risks of making
wrong decisions in order to reduce the necessary amount of inspection. Thus,
a proper sampling plan can be recommended only after the risks that can be
tolerated have been stated.

If the quality of the product is exactly on the margin, i.e., if $\theta = \theta'$, then it
will make little difference whether the product is classified as sub-standard or
not. However, if $\theta$ is considerably smaller than $\theta'$, then the acceptance of the
hypothesis that the product meets specifications (rejection of the hypothesis
that the product is sub-standard) will usually be considered as a serious error.
Similarly, if $\theta$ is much larger than $\theta'$, the acceptance of the hypothesis that the
product is sub-standard will generally be considered as a serious error. Thus,
the manufacturer will, in general, be able to select two values of $\theta$, $\theta_0$ and $\theta_1$ say
($\theta_0 < \theta'$ and $\theta_1 > \theta'$) such that the classification of the product as satisfactory
(meeting specifications) is considered an error of practical importance whenever
$\theta \le \theta_0$, and the classification of the product as sub-standard is considered an
error of practical importance whenever $\theta \ge \theta_1$. If $\theta$ lies between $\theta_0$ and $\theta_1$, a
wrong classification of the product will not be viewed as a serious error, since
in this case $\theta$ is near the marginal value $\theta'$.

After the two values $\theta_0$ and $\theta_1$ have been selected, the risks that we are willing
to tolerate can be stated in the following form: A sampling plan is required
such that the probability of classifying the product as satisfactory is less than
or equal to a preassigned quantity $\alpha$ whenever $\theta \le \theta_0$, and such that the probability
of classifying the product as sub-standard is less than or equal to a
preassigned quantity $\beta$ whenever $\theta \ge \theta_1$. Thus, the tolerated risks are characterized
by the four quantities $\theta_0$, $\theta_1$, $\alpha$ and $\beta$. A proper sampling plan can
be devised after these four quantities have been selected.

5.4.3. A sequential test of the hypothesis that $\theta < \theta'$ (the product is sub-standard).
Let $H_0$ be the hypothesis that $\theta = \theta_0$ and let $H_1$ be the hypothesis that $\theta = \theta_1$.
Let T be the sequential probability ratio test for testing $H_0$ against $H_1$ such that
$\alpha$ is the probability of accepting $H_1$ when $H_0$ is true and $\beta$ is the probability of
accepting $H_0$ when $H_1$ is true. This sequential test will satisfy all our requirements,
since for this test the probability of accepting $H_0$ (declaring the product
as sub-standard) is $\le \beta$ whenever $\theta \ge \theta_1$ and the probability of accepting $H_1$
(declaring the product as satisfactory) is $\le \alpha$ whenever $\theta \le \theta_0$.

The sequential test T is given as follows: Denote the successive observations
on x by $x_1, x_2, \dots$, etc. Accept the hypothesis that the product is satisfactory
at the m-th observation if

$$\log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2}} \ge \log\frac{1 - \beta}{\alpha}. \tag{5.26}$$

Accept the hypothesis that the product is sub-standard if

$$\log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2}} \le \log\frac{\beta}{1 - \alpha}. \tag{5.27}$$

Take an additional observation if

$$\log\frac{\beta}{1 - \alpha} < \log\frac{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2}}{e^{-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2}} < \log\frac{1 - \beta}{\alpha}. \tag{5.28}$$

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The inequalities (5.26), (5.27) and (5.28) are equivalent to 


= 7 1-8 O + 6 
5.29 i SP census log ———" Se ate 
( ) _ La. = 6 — b og 2 +m 9 

m 2 6 + 6 
5.30 te ta ho + 41 
( ) ig * Se ee + 5 
and 

ie O + & 

; cs lo — +m 5 
(5.31) 


m 


< ys La < ean log [-—# +n 
| A: — & a 


’ 9 + A 
> > 
respectively. 
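The step from (5.26) to (5.29) can be written out as follows. This is only a worked sketch of the algebra; it assumes $\theta_0 < \theta_1$, as the setup of the sampling plan implies, so that dividing by $(\theta_1-\theta_0)/\sigma^2$ preserves the direction of the inequality.

```latex
\begin{align*}
\log \frac{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_1)^2\right]}
          {\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a-\theta_0)^2\right]}
 &= -\frac{1}{2\sigma^2}\sum_{a=1}^{m}\left[(x_a-\theta_1)^2-(x_a-\theta_0)^2\right] \\
 &= \frac{1}{2\sigma^2}\left[2(\theta_1-\theta_0)\sum_{a=1}^{m}x_a + m(\theta_0^2-\theta_1^2)\right] \\
 &= \frac{\theta_1-\theta_0}{\sigma^2}\left[\sum_{a=1}^{m}x_a - m\,\frac{\theta_0+\theta_1}{2}\right].
\end{align*}
% Requiring this to be >= log((1-beta)/alpha) and dividing by
% (theta_1 - theta_0)/sigma^2 > 0 gives (5.29); the same division applied to
% (5.27) and (5.28) gives (5.30) and (5.31).
```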
Using the inequalities (5.29), (5.30) and (5.31) the test procedure can easily
be carried out as follows: For each $m$ compute the acceptance number

(5.32)  $A_m = \dfrac{\sigma^2}{\theta_1-\theta_0}\log\dfrac{\beta}{1-\alpha} + m\,\dfrac{\theta_0+\theta_1}{2}$

and the rejection number

(5.33)  $R_m = \dfrac{\sigma^2}{\theta_1-\theta_0}\log\dfrac{1-\beta}{\alpha} + m\,\dfrac{\theta_0+\theta_1}{2}.$

These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated
before inspection starts. Inspection is continued as long as
$A_m < \sum_{a=1}^{m} x_a < R_m$. At the first time that $\sum_{a=1}^{m} x_a$ does not lie
between $A_m$ and $R_m$, inspection is terminated. If at this final stage
$\sum_{a=1}^{m} x_a \le A_m$, the hypothesis that the product is sub-standard is
accepted, and if $\sum_{a=1}^{m} x_a \ge R_m$, the hypothesis that the product is
sub-standard is rejected.

The test procedure can also be carried out graphically as shown in Figure 6.
The number $m$ of observations is measured along the horizontal axis. The
points $(m, A_m)$ will lie on a straight line $L_0$ and the points $(m, R_m)$ will lie
on a parallel line $L_1$. We draw the parallel lines $L_0$ and $L_1$ and plot the
points $(m, \sum_{a=1}^{m} x_a)$ as inspection goes on. At the first time when the point
$(m, \sum_{a=1}^{m} x_a)$ does not lie between the lines $L_0$ and $L_1$ inspection is
terminated. The hypothesis that the product is sub-standard is rejected if the
point $(m, \sum_{a=1}^{m} x_a)$ lies on $L_1$ or above. The hypothesis in question is
accepted if the point $(m, \sum_{a=1}^{m} x_a)$ lies on $L_0$ or below.
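The procedure of 5.4.3 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original paper: the function names `boundary_numbers` and `inspect`, and the "undecided" outcome returned when a finite list of observations runs out before a boundary is reached, are assumptions made for the example.

```python
import math


def boundary_numbers(m, theta0, theta1, sigma, alpha, beta):
    """Return (A_m, R_m), the acceptance and rejection numbers (5.32), (5.33)."""
    scale = sigma ** 2 / (theta1 - theta0)
    drift = m * (theta0 + theta1) / 2.0
    a_m = scale * math.log(beta / (1.0 - alpha)) + drift
    r_m = scale * math.log((1.0 - beta) / alpha) + drift
    return a_m, r_m


def inspect(observations, theta0, theta1, sigma, alpha, beta):
    """Run the test on a finite list; return (decision, observations used)."""
    running_sum = 0.0
    m = 0
    for m, x in enumerate(observations, start=1):
        running_sum += x
        a_m, r_m = boundary_numbers(m, theta0, theta1, sigma, alpha, beta)
        if running_sum <= a_m:      # on or below the lower line: sub-standard
            return "sub-standard", m
        if running_sum >= r_m:      # on or above the upper line: satisfactory
            return "satisfactory", m
    # Hypothetical outcome for the sketch: the data ended before a decision.
    return "undecided", m
```

Tabulating `boundary_numbers(m, ...)` for m = 1, 2, ... before inspection starts corresponds to the table of acceptance and rejection numbers recommended in the text.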
5.4.4. The operating characteristic curve of the test. For any value $\theta$ denote
by $L_\theta$ the probability that the hypothesis that the product is sub-standard is
accepted. Obviously, $L_\theta$ will be a function of $\theta$ and is called the operating
characteristic curve of the test. The shape of the operating characteristic curve
will, in general, be of the type shown in Figure 7. $L_\theta$ approaches 1 as
$\theta \to -\infty$ and $L_\theta$ approaches zero as $\theta \to \infty$. Furthermore, $L_\theta$ is a
decreasing function of $\theta$. We already know the values of $L_\theta$ for $\theta = \theta_0$
and $\theta = \theta_1$. Now we shall give a method for computing the value of $L_\theta$
for any $\theta$. If $(\theta_1-\theta_0)/\sigma$ is fairly small, which will usually be the case in
practice, a good approximation to $L_\theta$ is given by (see equation 3.35)

(5.34)  $L_\theta \sim 1 - \dfrac{1-\left(\frac{\beta}{1-\alpha}\right)^h}{\left(\frac{1-\beta}{\alpha}\right)^h - \left(\frac{\beta}{1-\alpha}\right)^h} = \dfrac{\left(\frac{1-\beta}{\alpha}\right)^h - 1}{\left(\frac{1-\beta}{\alpha}\right)^h - \left(\frac{\beta}{1-\alpha}\right)^h}$

where the constant $h$ is determined as follows: First we compute the
characteristic function $\varphi(t)$, that is, the expected value of $e^{zt}$, of the variate

(5.35)  $z = \log \dfrac{\exp\left[-\frac{1}{2\sigma^2}(x-\theta_1)^2\right]}{\exp\left[-\frac{1}{2\sigma^2}(x-\theta_0)^2\right]} = \dfrac{1}{2\sigma^2}\left[2(\theta_1-\theta_0)x + \theta_0^2 - \theta_1^2\right].$

Thus, $z$ is normally distributed with mean
$\dfrac{\theta_0^2-\theta_1^2}{2\sigma^2} + \dfrac{(\theta_1-\theta_0)\theta}{\sigma^2}$ and variance $\dfrac{(\theta_1-\theta_0)^2}{\sigma^2}$.
Consequently, $\varphi(t)$ is given by

(5.36)  $\varphi(t) = \exp\left\{\left[\dfrac{\theta_0^2-\theta_1^2}{2\sigma^2} + \dfrac{(\theta_1-\theta_0)\theta}{\sigma^2}\right]t + \dfrac{(\theta_1-\theta_0)^2}{2\sigma^2}\,t^2\right\}.$

The value $h$ is the non-zero real root of the equation $\varphi(t) = 1$. Hence

(5.37)  $h = \dfrac{\theta_1^2-\theta_0^2-2(\theta_1-\theta_0)\theta}{(\theta_1-\theta_0)^2} = \dfrac{\theta_1+\theta_0-2\theta}{\theta_1-\theta_0}.$

The operating characteristic curve can be computed from (5.34) substituting
the right hand side member of (5.37) for $h$.
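The computation of the operating characteristic curve from (5.34) and (5.37) can be sketched as follows. The function names are assumptions of this example. At $\theta = (\theta_0+\theta_1)/2$ the value $h$ is zero and (5.34) takes the form 0/0; the sketch substitutes the limiting value $\log A / (\log A - \log B)$, with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$, which is an added convention and not a statement from the text.

```python
import math


def h_value(theta, theta0, theta1):
    """The non-zero root h of phi(t) = 1, equation (5.37)."""
    return (theta1 + theta0 - 2.0 * theta) / (theta1 - theta0)


def oc_curve(theta, theta0, theta1, alpha, beta):
    """Approximate probability L_theta of accepting 'sub-standard', eq. (5.34)."""
    a = (1.0 - beta) / alpha
    b = beta / (1.0 - alpha)
    h = h_value(theta, theta0, theta1)
    if abs(h) < 1e-12:
        # Limit of (5.34) as h -> 0 (added convention, see lead-in).
        return math.log(a) / (math.log(a) - math.log(b))
    # Note: for very large |h| the powers below can overflow a float.
    return (a ** h - 1.0) / (a ** h - b ** h)
```

For instance, `oc_curve(theta0, ...)` returns approximately $1-\alpha$ and `oc_curve(theta1, ...)` approximately $\beta$, in agreement with the two values of $L_\theta$ that are known in advance.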

5.4.5. The average amount of inspection required by the test. Let $E_\theta(n)$ denote
the expected value of the number of observations required by the test when $\theta$
is the true mean of $x$. According to (4.8) a good approximation to the value of
$E_\theta(n)$ is given by

  $E_\theta(n) \sim 2\sigma^2\,\dfrac{L_\theta \log\dfrac{\beta}{1-\alpha} + (1-L_\theta)\log\dfrac{1-\beta}{\alpha}}{\theta_0^2-\theta_1^2+2(\theta_1-\theta_0)\theta}$

where $L_\theta$ is given by (5.34).

In the rare event that the number of observations reaches three times the
expected value before the test is terminated, we can truncate the test at this
stage without seriously affecting the probabilities of making a wrong decision.
(See section 4.6 in Part I).
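A minimal sketch of the approximation for $E_\theta(n)$, together with the truncation point at three times that value, is given below. The helper names are assumptions of the example, and the operating characteristic is recomputed inside the block so that it is self-contained. The displayed formula is singular at $\theta = (\theta_0+\theta_1)/2$, where its denominator vanishes; the sketch simply refuses that input rather than guess a limiting value.

```python
import math


def oc(theta, theta0, theta1, alpha, beta):
    """L_theta from (5.34), with h taken from (5.37)."""
    a = (1.0 - beta) / alpha
    b = beta / (1.0 - alpha)
    h = (theta1 + theta0 - 2.0 * theta) / (theta1 - theta0)
    return (a ** h - 1.0) / (a ** h - b ** h)


def expected_observations(theta, theta0, theta1, sigma, alpha, beta):
    """Approximate expected number of observations when theta is the true mean."""
    denom = theta0 ** 2 - theta1 ** 2 + 2.0 * (theta1 - theta0) * theta
    if denom == 0.0:
        raise ValueError("formula is singular at theta = (theta0 + theta1) / 2")
    l_theta = oc(theta, theta0, theta1, alpha, beta)
    numer = (l_theta * math.log(beta / (1.0 - alpha))
             + (1.0 - l_theta) * math.log((1.0 - beta) / alpha))
    return 2.0 * sigma ** 2 * numer / denom


def truncation_point(theta, theta0, theta1, sigma, alpha, beta):
    """Three times the expected value, the truncation stage suggested above."""
    return 3.0 * expected_observations(theta, theta0, theta1, sigma, alpha, beta)
```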


6. Outline of a General Theory of Sequential Tests of Hypotheses when No 
Restrictions Are Imposed on the Alternative Values of the Unknown 
Parameters 


6.1. Sequential test of a simple hypothesis with no restrictions on the alternative
values of the unknown parameters. Consider the following general case. Let
$X_1, \ldots, X_p$ be a set of $p$ random variables and let
$f(x_1, \ldots, x_p, \theta_1, \ldots, \theta_k)$ be the joint probability density function of
these random variables involving $k$ unknown parameters $\theta_1, \ldots, \theta_k$. Suppose
that we wish to test the hypothesis $H_0$ that $\theta_1 = \theta_1^0, \ldots, \theta_k = \theta_k^0$, where
$\theta_1^0, \ldots, \theta_k^0$ are some given specified values. Denote the set of all a priori
possible parameter points by $\Omega$. Assume that $\Omega$ contains at least a finite
$k$-dimensional sphere with the center $(\theta_1^0, \ldots, \theta_k^0)$. Let $\Omega^*$ be the set of
all possible alternative parameter points; i.e., $\Omega^*$ is the whole parameter
space $\Omega$ with the exception of the point $\theta^0 = (\theta_1^0, \ldots, \theta_k^0)$. For any
statistical procedure for testing $H_0$, the probability of an error of the first
kind will have a definite value, but the probability of an error of the second
kind will depend on the true alternative; i.e., it will be a single valued function
$\beta(\theta)$ defined over all points $\theta$ of $\Omega^*$. Let $w(\theta)$ be some non-negative
function, called weight function, such that $\int_{\Omega^*} w(\theta)\,d\theta = 1$. Suppose that we
wish to construct a sequential test such that the probability of an error of the
first kind is equal to a given $\alpha$ and that the weighted average
$\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of the probabilities of errors of the second kind is equal to
some given positive value $\beta$. This problem can easily be solved as follows: Let
$p_{0n}$ be equal to the product $\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0)$ where $x_{ia}$
denotes the $a$-th observation on $X_i$ $(i = 1, \ldots, p;\ a = 1, \ldots, n)$.
Furthermore, let $p_{1n}$ be defined by

(6.1)  $p_{1n} = \int_{\Omega^*} w(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\theta.$

The expression $p_{1n}$ can be interpreted as the probability density in the sample
space of $n$ observations on the variates $x_1, \ldots, x_p$, if we assume that the
parameter point $\theta$ in $\Omega^*$ has a probability distribution given by the density
function $w(\theta)\,d\theta$.

We shall denote by $H_1$ the hypothesis that the probability density function
in the sample space of $n$ observations on $X_1, \ldots, X_p$ is given by $p_{1n}$ defined
in equation (6.1). The problem of testing $H_0$ against the single alternative
$H_1$ is not exactly of the type discussed in Part I, since $p_{1n}$ given in (6.1)
cannot be represented, in general, as a product of $n$ factors where the $a$-th
factor depends only on the observations $x_{1a}, \ldots, x_{pa}$. However, it was
pointed out in section 3.2 that the fundamental inequalities derived there
remain valid also when $p_{1n}$ is given by an expression of the type (6.1). Thus,
we can use the sequential probability ratio test for testing $H_0$ against the
single alternative $H_1$. We reject $H_0$ if

(6.2)  $\dfrac{p_{1n}}{p_{0n}} \ge A,$

we accept $H_0$ if

(6.3)  $\dfrac{p_{1n}}{p_{0n}} \le B,$

and we make an additional observation if

(6.4)  $B < \dfrac{p_{1n}}{p_{0n}} < A.$

The expression $p_{1n}$ is given by (6.1) and the constants $A$ and $B$ are chosen so
that the probability of accepting $H_1$ when $H_0$ is true is $\alpha$, and the probability
of accepting $H_0$ when $H_1$ is true is $\beta$. Thus, for practical purposes we may put
$A = \dfrac{1-\beta}{\alpha}$ and $B = \dfrac{\beta}{1-\alpha}$.

Using the sequential process defined by the inequalities (6.2), (6.3), and (6.4)
we obviously have

(6.5)  $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta = \beta$

where for each point $\theta$ in $\Omega^*$, $\beta(\theta)$ denotes the probability of accepting $H_0$
under the assumption that $\theta$ is the true parameter point.

Thus, the sequential test given by (6.2), (6.3), and (6.4) provides a satisfactory
solution of the problem if we want a test procedure such that the probability of
an error of the first kind is $\alpha$ and the weighted average $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of
the probabilities of errors of the second kind is $\beta$. Practical problems, however,
do not always take this form. Many instances require a test procedure such
that $\beta(\theta)$ should be less than or equal to a given positive value $\beta$ for all
parameter points $\theta$ whose "distance" (defined in some sense) from $\theta^0$ is greater
than or equal to some given positive value $d_0$. The "distance" of two parameter
points $\theta^1$ and $\theta^2$ may be defined by some function $\delta(\theta^1, \theta^2)$ which is equal to
zero if $\theta^1 = \theta^2$ and is greater than zero if $\theta^1 \ne \theta^2$. Furthermore, for any
three points $\theta^1, \theta^2, \theta^3$ we have $\delta(\theta^1, \theta^2) = \delta(\theta^2, \theta^1)$ and
$\delta(\theta^1, \theta^2) + \delta(\theta^2, \theta^3) \ge \delta(\theta^1, \theta^3)$. The distance function will, in general,
be chosen according to practical needs and mathematical convenience.

Given the distance function $\delta(\theta^1, \theta^2)$ and given the requirements that the
probability of an error of the first kind be $\alpha$ and the probability of an error of
the second kind should not exceed $\beta$ whenever the distance of the true parameter
point from $\theta^0$ is greater than or equal to $d_0$, the aim is, of course, to construct
a sequential test which satisfies these requirements with a minimum expected
number of observations.

While an exact solution of this problem has not yet been found, the following
approach seems reasonable: Let $\Omega_0$ be the set of all parameter points $\theta$ for
which $\delta(\theta^0, \theta) \ge d_0$. We restrict ourselves to the class $C_\delta$ of sequential tests
based on the ratio $p_{1n}/p_{0n}$ where

(6.6)  $p_{0n} = \prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0),$

(6.7)  $p_{1n} = \int_{\Omega_0} w(\theta) \prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\,d\theta$

and $w(\theta)$ may be any non-negative function of $\theta$, called weight function, for
which

(6.8)  $\int_{\Omega_0} w(\theta)\,d\theta = 1.$

For carrying out the sequential test two constants $A$ and $B$ are chosen. The
hypothesis $H_0$ is accepted if $p_{1n}/p_{0n} \le B$, $H_0$ is rejected if $p_{1n}/p_{0n} \ge A$, and an
additional observation is made if $B < p_{1n}/p_{0n} < A$. The restriction to the class
$C_\delta$ of sequential tests is suggested by the fact that we are led to these tests if
it is required that some weighted average of the probabilities of errors of the
second kind be equal to a given value $\beta$.

Accepting the restriction that the sequential test should be a member of the
class $C_\delta$, we still need a principle for choosing the weight function $w(\theta)$. It is
clear that the maximum of $\beta(\theta)$ in $\Omega_0$ depends on the quantities $A$, $B$, and the
weight function $w(\theta)$. Denote this maximum value by $\beta_{\max}[A, B, w(\theta)]$. Since
it is desirable to make $\beta_{\max}[A, B, w(\theta)]$ as small as possible, it is proposed to
determine $w(\theta)$ so that the expression $\beta_{\max}[A, B, w(\theta)]$ becomes a minimum with
respect to $w(\theta)$. Since for given values $A$ and $B$ the value of the weighted
average $\int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ is practically independent of $w(\theta)$ (it is nearly equal
to $\dfrac{B(A-1)}{A-B}$), minimizing $\beta_{\max}[A, B, w(\theta)]$ is practically equivalent to
minimizing the difference $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$. For convenience
we determine $w(\theta)$ so that $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ becomes a
minimum. For this weight function the maximum of $\beta(\theta)$ in $\Omega_0$ will depend only
on $A$ and $B$. Denote this value by $\beta(A, B)$. Finally we determine the values $A$
and $B$ so that $\beta(A, B) = \beta$ and the probability of an error of the first kind
becomes $\alpha$.

The determination of $w(\theta)$ is a problem in the calculus of variations. In
some important cases, however, the solution can be obtained by the following
simple procedure: Let $S(d)$ be the set of all parameter points $\theta$ for which
$\delta(\theta^0, \theta) = d$. Let $v(\theta)$ be a non-negative weight function defined over the
surface $S(d_0)$ so that the surface integral $\int_{S(d_0)} v(\theta)\,d\omega = 1$ (where $d\omega$ denotes
the infinitesimal surface element). Consider the following sequential
procedure: Reject $H_0$ if

(6.9)  $\dfrac{\int_{S(d_0)} v(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\omega}{\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0)}$

is greater than or equal to $A$, accept $H_0$ if (6.9) is less than or equal to $B$, and
make an additional observation if the value of (6.9) lies between $A$ and $B$. The
constants $A$ and $B$ are so chosen that the probability of an error of the first kind
is $\alpha$ and $\int_{S(d_0)} \beta(\theta)v(\theta)\,d\omega = \beta$. In many statistical problems it is possible to
find a weight function $v(\theta)$ such that for a conveniently chosen distance function
$\delta(\theta^1, \theta^2)$ the probability $\beta(\theta)$ of an error of the second kind becomes constant
on the surface $S(d)$ for any value $d$, and, furthermore, $\beta(\theta)$ decreases with
increasing $d$. For such a weight function $v(\theta)$, the sequential test based on (6.9)
will provide a solution of the problem. In fact, the weight function $v(\theta)$ over the
surface $S(d_0)$ can be considered a limiting case of a weight function $w(\theta)$ defined
in $\Omega_0$ which takes the value zero for any $\theta$ whose distance from $\theta^0$ is greater
than $d_0 + \Delta$ with $\Delta$ approaching zero in the limit. For the weight function $v(\theta)$
the maximum of $\beta(\theta)$ in $\Omega_0$ is equal to the weighted integral of $\beta(\theta)$. Thus, for
this weight function the difference between the maximum of $\beta(\theta)$ and the weighted
integral of $\beta(\theta)$ is minimized.

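We shall illustrate this procedure by a simple example. Let $X_1, \ldots, X_k$
be $k$ normally and independently distributed variates with unit variances. The
mean values $\theta_1, \ldots, \theta_k$ are unknown. Suppose that it is required to test the
hypothesis $H_0$ that $\theta_1 = \cdots = \theta_k = 0$. Assume that the distance of two points
$\theta^1$ and $\theta^2$ is equal to

  $+\sqrt{(\theta_1^1 - \theta_1^2)^2 + \cdots + (\theta_k^1 - \theta_k^2)^2}.$

Then $S(d)$ is a sphere with center at the origin and radius $d$. Let $v(\theta)$ be
constant on $S(d_0)$ and equal to the reciprocal of the area of $S(d_0)$. We shall show
that for this weight function $v(\theta)$, $\beta(\theta)$ is constant on the sphere $S(d)$ and is
monotonically decreasing with increasing $d$. For this purpose we prove first
that (6.9) is a monotonically increasing function of $\bar{x}_1^2 + \cdots + \bar{x}_k^2$ where $\bar{x}_i$
is the arithmetic mean of the observations on $x_i$. In fact, the expression (6.9)
becomes

(6.10)  $\dfrac{c_1 \int_{S(d_0)} \frac{1}{(2\pi)^{kn/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{k}\sum_{a=1}^{n}(x_{ia}-\theta_i)^2\right] d\omega}{\frac{1}{(2\pi)^{kn/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{k}\sum_{a=1}^{n} x_{ia}^2\right]} = c_1 \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left[n \sum_{i=1}^{k} \bar{x}_i\theta_i\right] d\omega$

where $c_1$ is the reciprocal of the area of $S(d_0)$ and $\bar{x}_i$ is the arithmetic mean of
the $n$ observations $x_{ia}$ $(a = 1, \ldots, n)$. Let $r_{\bar{x}}$ denote $\left|\sqrt{\sum_{i=1}^{k}\bar{x}_i^2}\right|$ and let
$\alpha(\theta)$ $(0 \le \alpha \le \pi)$ denote the angle between the vector $(\bar{x}_1, \ldots, \bar{x}_k)$ and the
vector $(\theta_1, \ldots, \theta_k)$. Then (6.10) can be written

(6.11)  $c_1 \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left(n r_{\bar{x}} d_0 \cos[\alpha(\theta)]\right) d\omega.$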


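Because of the symmetry of the sphere, the value of (6.11) will not be changed
if we substitute $\gamma(\theta)$ for $\alpha(\theta)$ where $\gamma(\theta)$ $(0 \le \gamma(\theta) \le \pi)$ denotes the angle
between the vector $\theta$ and an arbitrarily chosen fixed vector $u$. From this it
follows that the value of (6.11) depends only on $r_{\bar{x}}$.

Now we shall show that (6.11) is a strictly increasing function of $r_{\bar{x}}$. For this
purpose we have merely to show that

(6.12)  $I(r_{\bar{x}}) = \int_{S(d_0)} \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega$

is a strictly increasing function of $r_{\bar{x}}$. We have

(6.13)  $\dfrac{dI(r_{\bar{x}})}{dr_{\bar{x}}} = \int_{S(d_0)} n d_0 \cos[\gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega.$

Denote by $\omega_1$ the subset of $S(d_0)$ in which $0 \le \gamma(\theta) \le \frac{\pi}{2}$ and by $\omega_2$ the subset
in which $\frac{\pi}{2} \le \gamma(\theta) \le \pi$. Because of the symmetry of the sphere we have

  $\int_{\omega_2} n d_0 \cos[\gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega$
    $= \int_{\omega_1} n d_0 \cos[\pi - \gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\pi - \gamma(\theta)]\right) d\omega$
    $= -\int_{\omega_1} n d_0 \cos[\gamma(\theta)] \exp\left(-n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega.$

Hence

(6.14)  $\dfrac{dI(r_{\bar{x}})}{dr_{\bar{x}}} = n d_0 \int_{\omega_1} \cos[\gamma(\theta)] \left\{\exp\left(n d_0 r_{\bar{x}} \cos[\gamma(\theta)]\right) - \exp\left(-n d_0 r_{\bar{x}} \cos[\gamma(\theta)]\right)\right\} d\omega.$

The right hand side of (6.14) is positive. Hence, we have proved that
expression (6.11) (or (6.10)) is a strictly increasing function of $r_{\bar{x}}$.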
To show that $\beta(\theta)$ is constant on $S(d)$ and is monotonically decreasing with
increasing $d$, let $y_1, \ldots, y_k$ be an orthogonal linear transformation of
$x_1, \ldots, x_k$ so that $E(y_1) = \sqrt{\theta_1^2 + \cdots + \theta_k^2}$, $E(y_i) = 0$ $(i = 2, \ldots, k)$.
Since $\bar{y}_1^2 + \cdots + \bar{y}_k^2 = \bar{x}_1^2 + \cdots + \bar{x}_k^2$ and since (6.11) depends only on
$\bar{x}_1^2 + \cdots + \bar{x}_k^2$, it is seen that the sequence of expressions (6.11) formed for
any sequence of integers $n$ has a joint distribution which depends only on
$\sqrt{\theta_1^2 + \cdots + \theta_k^2}$. Hence $\beta(\theta)$ is constant on any sphere with center at the
origin. Since (6.11) is a strictly increasing function of $r_{\bar{x}}$, it can be shown that
$\beta(\theta)$ is a monotonically decreasing function of $\sqrt{\theta_1^2 + \cdots + \theta_k^2}$. Hence, we
can test the hypothesis $H_0$ by the sequential process based on (6.10).

If $k = 1$, that is, if we test the mean value of a single normal variate, the
sphere $S(d)$ is a 0-dimensional sphere consisting of the two points $\theta_1 = +d$ and
$\theta_1 = -d$ and expression (6.10) reduces to

(6.15)  $\dfrac{\frac{1}{2}\,\frac{1}{(2\pi)^{n/2}} \left\{\exp\left[-\frac{1}{2}\sum_a (x_a - d_0)^2\right] + \exp\left[-\frac{1}{2}\sum_a (x_a + d_0)^2\right]\right\}}{\frac{1}{(2\pi)^{n/2}} \exp\left[-\frac{1}{2}\sum_a x_a^2\right]}$
          $= \tfrac{1}{2} \exp\left[-\tfrac{1}{2} n d_0^2\right] \left\{\exp[n\bar{x}d_0] + \exp[-n\bar{x}d_0]\right\}.$
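For the case $k = 1$ the resulting two-sided test can be sketched as follows. The right hand side of (6.15) equals $\exp[-\frac{1}{2} n d_0^2]\cosh(n\bar{x}d_0)$, and the sketch works with its logarithm to avoid overflow. The function names, the "undecided" outcome for a finite list, and the use of the practical values $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$ are assumptions of the example.

```python
import math


def log_ratio(xs, d0):
    """Logarithm of (6.15): -n*d0^2/2 + log cosh(n * xbar * d0)."""
    n = len(xs)
    y = abs(sum(xs)) * d0                       # |n * xbar * d0|
    # log cosh(y) written so that exp() never receives a large argument
    log_cosh = y + math.log1p(math.exp(-2.0 * y)) - math.log(2.0)
    return -0.5 * n * d0 ** 2 + log_cosh


def two_sided_test(observations, d0, alpha, beta):
    """Sequential test of H0: mean = 0 based on (6.15), unit variance assumed."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    seen = []
    for x in observations:
        seen.append(x)
        value = log_ratio(seen, d0)
        if value >= log_a:
            return "reject H0", len(seen)
        if value <= log_b:
            return "accept H0", len(seen)
    # Hypothetical outcome for the sketch: the data ended before a decision.
    return "undecided", len(seen)
```

Because the statistic depends on the data only through $|\bar{x}|$, a run of observations and its mirror image lead to the same decision at the same stage.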


6.2. Sequential test of a composite hypothesis. We shall give only a brief
outline of the principles on which a sequential test of a composite hypothesis
can be based, since they are analogous to those for a simple hypothesis. Let
$X_1, \ldots, X_p$ be a set of $p$ random variables and let
$f(x_1, \ldots, x_p, \theta_1, \ldots, \theta_k)$ be the joint probability density function of
these variables involving $k$ unknown parameters $\theta_1, \ldots, \theta_k$. Denote the set
of all possible parameter points $\theta = (\theta_1, \ldots, \theta_k)$ by $\Omega$. Suppose that we wish
to test the hypothesis $H_0$ that the true parameter point $\theta$ is contained in the
subset $\omega$ of $\Omega$. Let $\bar{\omega}$ be the set of all points of $\Omega$ which are not contained in
$\omega$. Furthermore, let $w_0(\theta)$ and $w_1(\theta)$ be two non-negative functions of $\theta$,
called weight functions, such that

(6.16)  $\int_{\omega} w_0(\theta)\,d\theta = 1$ and $\int_{\bar{\omega}} w_1(\theta)\,d\theta = 1.$

If $\omega$ is a surface in the space $\Omega$ then the integral over $\omega$ is meant to be the
surface integral over $\omega$.

In testing a composite hypothesis the probability of an error of the first kind
need not necessarily be the same for all points $\theta$ in $\omega$. It will, in general, be a
function $\alpha(\theta)$ of the true point $\theta$ in $\omega$. Similarly the probability of an error of
the second kind is a function $\beta(\theta)$ of $\theta$ defined for all points in $\bar{\omega}$. Suppose
that we wish to construct a sequential test such that the weighted average
$\int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta$ of the probabilities of errors of the first kind is a given value
$\alpha$, and the weighted average $\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\,d\theta$ of the probabilities of errors of
the second kind is a given value $\beta$. Then the following sequential test can be
used: Denote by $H_0^*$ the hypothesis that the probability density in the sample
space of $n$ observations on $X_1, \ldots, X_p$ is given by

(6.17)  $p_{0n} = \int_{\omega} w_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\theta$

and by $H_1^*$ the hypothesis that the density in the sample space is given by

(6.18)  $p_{1n} = \int_{\bar{\omega}} w_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\theta.$

The sequential probability ratio test for testing $H_0^*$ against the single
alternative $H_1^*$ provides a solution of our problem. If the constants $A$ and $B$ in
this sequential test are chosen so that the probability is $\alpha$ that we reject $H_0^*$
when $H_0^*$ is true, and the probability is $\beta$ that we accept $H_0^*$ when $H_1^*$ is true,
then for this sequential test we have

  $\int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta = \alpha$

and

  $\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\,d\theta = \beta.$

This can be proved in the same way as the corresponding statement in the case
of a simple hypothesis.

Frequently we may require a sequential test procedure such that the least
upper bound of $\alpha(\theta)$ in $\omega$ is equal to a given $\alpha$ and $\beta(\theta)$ is less than or equal
to a given $\beta$ for all points $\theta$ whose "distance" (defined in some sense) from $\omega$
is greater than or equal to a given positive value $d_0$. The "distance" of a
parameter point $\theta$ from $\omega$ may be defined by some function $\delta(\theta, \omega)$ which is
positive if $\theta$ is not in $\omega$ and is zero if $\theta$ is in $\omega$. The distance function will be
chosen in general according to practical needs and mathematical convenience.
For reasons similar to those discussed in the case of a simple hypothesis, an
appropriate sequential test procedure with the desired properties can be found
as follows: Let $\bar{\omega}(d)$ be the set of all points $\theta$ for which $\delta(\theta, \omega) \ge d$. Let,
furthermore, $w_0(\theta)$ and $w_1(\theta)$ be two weight functions such that

(6.19)  $\int_{\omega} w_0(\theta)\,d\theta = \int_{\bar{\omega}(d_0)} w_1(\theta)\,d\theta = 1.$

Denote by $H_0^*$ the hypothesis that the probability density in the sample space
of $n$ observations on $X_1, \ldots, X_p$ is given by

(6.20)  $p_{0n} = \int_{\omega} w_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta)\right] d\theta \qquad (n = 1, 2, \ldots)$

and by $H_1^*$ the hypothesis that the probability density in the sample space of $n$
observations on $X_1, \ldots, X_p$ is given by

(6.21)  $p_{1n} = \int_{\bar{\omega}(d_0)} w_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta)\right] d\theta \qquad (n = 1, 2, \ldots).$

Consider the sequential probability ratio test for testing the simple hypothesis
$H_0^*$ against the single alternative $H_1^*$. For any $\theta$ in $\omega$ let $\alpha(\theta)$ be the
probability of accepting $H_1^*$ when $\theta$ is true, and for any $\theta$ in $\bar{\omega}$ let $\beta(\theta)$ be the
probability of accepting $H_0^*$ when $\theta$ is true. It is clear that $\alpha(\theta)$ and $\beta(\theta)$
depend on the constants $A$ and $B$ used in the sequential process and on the weight
functions $w_0(\theta)$ and $w_1(\theta)$. For given $A$, $B$, $w_0(\theta)$ and $w_1(\theta)$ let
$\beta[A, B, w_0(\theta), w_1(\theta)]$ be the least upper bound of $\beta(\theta)$ in $\bar{\omega}(d_0)$ and let
$\alpha[A, B, w_0(\theta), w_1(\theta)]$ be the least upper bound of $\alpha(\theta)$ in $\omega$. Consider the
differences

  $\Delta\alpha[A, B, w_0(\theta), w_1(\theta)] = \alpha[A, B, w_0(\theta), w_1(\theta)] - \int_{\omega} w_0(\theta)\alpha(\theta)\,d\theta$

and

  $\Delta\beta[A, B, w_0(\theta), w_1(\theta)] = \beta[A, B, w_0(\theta), w_1(\theta)] - \int_{\bar{\omega}(d_0)} w_1(\theta)\beta(\theta)\,d\theta.$

Determine $w_0(\theta)$ and $w_1(\theta)$ so that $\max[\Delta\alpha, \Delta\beta]$ is a minimum. For these
weight functions the least upper bound of $\alpha(\theta)$ in $\omega$ and the least upper bound of
$\beta(\theta)$ in $\bar{\omega}(d_0)$ will be functions of $A$ and $B$ only. Finally, we determine $A$ and
$B$ so that the least upper bound of $\alpha(\theta)$ in $\omega$ becomes $\alpha$, and the least upper
bound of $\beta(\theta)$ in $\bar{\omega}(d_0)$ becomes $\beta$.

The determination of $w_0(\theta)$ and $w_1(\theta)$ involves the solution of problems in
the calculus of variations. However, in some important cases the solution of
the problem can easily be derived, since weight functions $w_0(\theta)$ and $w_1(\theta)$ can
be found for which $\Delta\alpha = \Delta\beta = 0$. Such a situation is given, for instance, in the
following case: Let $S(d)$ be the set of all points $\theta$ for which $\delta(\theta, \omega) = d$.
Suppose that we can find two weight functions $v_0(\theta)$ and $v_1(\theta)$ such that
$\int_{\omega} v_0(\theta)\,d\theta = \int_{S(d_0)} v_1(\theta)\,dS = 1$ ($dS$ denotes the infinitesimal surface element
of $S(d_0)$) and the sequential probability ratio test based on

  $\dfrac{\int_{S(d_0)} v_1(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta)\right] dS}{\int_{\omega} v_0(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta)\right] d\theta}$

has the following properties: (1) $\alpha(\theta)$ is constant in $\omega$; (2) $\beta(\theta)$ is constant on
$S(d)$ for any $d \ge d_0$; (3) $\beta(\theta)$ is strictly decreasing with increasing $d$ in the
domain $d \ge d_0$. Then for these weight functions we evidently have
$\Delta\alpha = \Delta\beta = 0$.

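Let us illustrate this by a simple example. Let $X$ be a normally distributed
variate with unknown mean $\mu$ and unknown variance $\sigma^2$. Suppose that we
want to test the hypothesis $H_0$ that $\mu = 0$ and that the distance of the point
$(\mu, \sigma)$ from the set $\omega$ is defined by $|\mu/\sigma|$.

The set $S(d_0)$ then consists of all points $(\mu, \sigma)$ for which $\mu = +d_0\sigma$ or
$\mu = -d_0\sigma$. The set $\omega$ consists of all points $(0, \sigma)$ where $\sigma$ can take any
arbitrary positive value. Let $r$ be a positive value. We define the weight
functions $v_{0r}(\sigma)$ and $v_{1r}(\sigma)$ as follows: $v_{0r}(\sigma) = \frac{1}{r}$ if $0 \le \sigma \le r$ and equals zero
for all other values of $\sigma$. The weight function $v_{1r}(\sigma)$ is equal to $\frac{1}{2r}$ if
$0 \le \sigma \le r$ and $\mu = \pm d_0\sigma$, and equal to zero otherwise.

Then

(6.22)  $p_{1n} = \int_{S(d_0)} v_{1r}(\sigma)\,\dfrac{1}{(2\pi)^{n/2}\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a-\mu)^2}{\sigma^2}\right] d\sigma$
          $= \dfrac{1}{2r(2\pi)^{n/2}} \int_0^r \left\{\dfrac{1}{\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a - d_0\sigma)^2}{\sigma^2}\right] + \dfrac{1}{\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma$

and

(6.23)  $p_{0n} = \dfrac{1}{(2\pi)^{n/2}\, r} \int_0^r \dfrac{1}{\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a x_a^2}{\sigma^2}\right] d\sigma.$

Hence

(6.24)  $\dfrac{p_{1n}}{p_{0n}} = \dfrac{\frac{1}{2}\int_0^r \left\{\frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a (x_a - d_0\sigma)^2}{\sigma^2}\right] + \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a (x_a + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma}{\int_0^r \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a x_a^2}{\sigma^2}\right] d\sigma}.$

We consider the limiting case when $r \to \infty$. Then

(6.25)  $\lim_{r\to\infty} \dfrac{p_{1n}}{p_{0n}} = \dfrac{\frac{1}{2}\int_0^\infty \left\{\frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a (x_a - d_0\sigma)^2}{\sigma^2}\right] + \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a (x_a + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma}{\int_0^\infty \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_a x_a^2}{\sigma^2}\right] d\sigma}.$

The sequential test based on the ratio (6.25) provides a solution of the problem
if it can be shown to have the following three properties: (1) $\alpha(\theta)$ is constant in
$\omega$; (2) $\beta(\theta)$ is only a function of $\left|\frac{\mu}{\sigma}\right|$; (3) $\beta(\theta)$ is monotonically decreasing
with increasing $\left|\frac{\mu}{\sigma}\right|$. Denote $\frac{1}{n}\sum_{a=1}^{n} x_a$ by $\bar{x}$ and $\sum_{a=1}^{n} (x_a - \bar{x})^2$ by $S^2$.
Since the distribution of $\frac{\bar{x}}{S}$ depends only on $\frac{\mu}{\sigma}$, the first two properties are
proved if we show that the ratio (6.25) is a single valued function of
$\left|\frac{\bar{x}}{S}\right|$.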


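First we show that the numerator of the ratio (6.25) is a homogeneous function
of $(x_1, \ldots, x_n)$ of degree $-(n - 1)$. In fact, making the transformation
$\sigma = \lambda t$ we obtain

  $\int_0^\infty \left\{\dfrac{1}{\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (\lambda x_a - d_0\sigma)^2}{\sigma^2}\right] + \dfrac{1}{\sigma^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (\lambda x_a + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma$

    $= \int_0^\infty \left\{\dfrac{1}{\lambda^n t^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a - d_0 t)^2}{t^2}\right] + \dfrac{1}{\lambda^n t^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a + d_0 t)^2}{t^2}\right]\right\} \lambda\,dt$

    $= \dfrac{1}{\lambda^{n-1}} \int_0^\infty \left\{\dfrac{1}{t^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a - d_0 t)^2}{t^2}\right] + \dfrac{1}{t^n} \exp\left[-\dfrac{1}{2}\dfrac{\sum_a (x_a + d_0 t)^2}{t^2}\right]\right\} dt.$

This proves that the numerator of (6.25) is a homogeneous function of degree
$-(n - 1)$. Similarly, it can be shown that the denominator of (6.25) is also a
homogeneous function of degree $-(n - 1)$. Thus the ratio (6.25) is a homogeneous
function of zero degree in the variables $x_1, \ldots, x_n$.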

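It can be seen that (6.25) is a function of the two expressions $\sum x_a^2$ and
$\sum x_a$ only; i.e.,

(6.26)  $\dfrac{p_{1n}}{p_{0n}} = \Phi\left(\sum x_a^2, \sum x_a\right).$

Let $v = \left|\sqrt{\sum x_a^2}\right|$. Since (6.26) is a homogeneous function of zero degree,
its value is not changed by substituting $\frac{x_a}{v}$ for $x_a$. Hence,

(6.27)  $\dfrac{p_{1n}}{p_{0n}} = \Phi\left(\dfrac{\sum x_a^2}{v^2}, \dfrac{\sum x_a}{v}\right) = \Phi\left(1, \dfrac{n\bar{x}}{v}\right).$

Since $\Phi\left(\sum x_a^2, -\sum x_a\right) = \Phi\left(\sum x_a^2, \sum x_a\right)$, we see that

  $\dfrac{p_{1n}}{p_{0n}} = \Phi\left(1, \dfrac{n|\bar{x}|}{v}\right),$

that is, the ratio is a single valued function of $\frac{\bar{x}^2}{v^2}$. Since $\frac{\bar{x}^2}{v^2}$ is a single
valued function of $\left|\frac{\bar{x}}{S}\right|$ we have proved that $\frac{p_{1n}}{p_{0n}}$ is a single valued
function of $\left|\frac{\bar{x}}{S}\right|$.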


In order to prove property (3) of the sequential test based on the ratio (6.25),
we have merely to show that (6.25) is a strictly increasing function of
$\left|\frac{\bar{x}}{S}\right|$. Since $\frac{\bar{x}^2}{v^2}$ is a strictly increasing function of $\left|\frac{\bar{x}}{S}\right|$, we have
only to show that (6.25) is a strictly increasing function of $\frac{\bar{x}^2}{v^2}$. The latter
statement is obviously proved if we show that (6.25) increases with increasing
value $|\bar{x}|$ while keeping $v$ fixed. For fixed value of $v$ the denominator of (6.25)
is constant. Thus, we have merely to show that the numerator of (6.25)
increases with increasing $|\bar{x}|$ while keeping $v$ fixed. This follows easily from
the fact that

  $\exp\left[\dfrac{n d_0 \bar{x}}{\sigma}\right] + \exp\left[-\dfrac{n d_0 \bar{x}}{\sigma}\right]$

is a strictly increasing function of $|\bar{x}|$.
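The properties just proved can be checked numerically. The sketch below evaluates the limiting ratio (6.25) by a plain Simpson rule; it is an illustration under stated assumptions, not the method of the paper. In particular, the substitution $u = 1/\sigma$ (which turns $\sigma^{-n}\,d\sigma$ into $u^{n-2}\,du$), the choice of the upper integration limit, and all function names are assumptions made for the example, and at least two observations with $\sum x_a^2 > 0$ are required.

```python
import math


def _simpson(f, a, b, steps=4000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def ratio_6_25(xs, d0):
    """Numerical value of the limiting ratio (6.25) for observations xs."""
    n = len(xs)
    s1 = sum(xs)                      # n * xbar
    v2 = sum(x * x for x in xs)       # v^2
    if n < 2 or v2 == 0.0:
        raise ValueError("need at least two observations, not all zero")
    c = d0 * abs(s1)
    # Rough location of the integrand's peak plus a generous margin
    # (an ad hoc choice for this sketch).
    mode = (c + math.sqrt(c * c + 4.0 * v2 * (n - 2))) / (2.0 * v2)
    upper = mode + 12.0 / math.sqrt(v2)

    def numerator(u):
        q = v2 * u * u + n * d0 * d0
        return 0.5 * u ** (n - 2) * (math.exp(-0.5 * (q - 2.0 * d0 * s1 * u))
                                     + math.exp(-0.5 * (q + 2.0 * d0 * s1 * u)))

    def denominator(u):
        return u ** (n - 2) * math.exp(-0.5 * v2 * u * u)

    return _simpson(numerator, 0.0, upper) / _simpson(denominator, 0.0, upper)
```

Multiplying every observation by the same positive constant, or changing the sign of every observation, should leave the returned value unchanged up to integration error, which reflects the homogeneity of degree zero and the dependence on $|\bar{x}/S|$ alone.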
NON-PARAMETRIC ESTIMATION. I. VALIDATION
OF ORDER STATISTICS


By H. Scheffé and J. W. Tukey


Syracuse University and Princeton University

1. Summary. Previous work on non-parametric estimation has concerned
three problems: (i) confidence intervals for an unknown quantile, (ii) population
tolerance limits, (iii) confidence bands for an unknown cumulative distribution
function (cdf). For problem (iii) a solution has been available which is valid
for any cdf whatever, but for (i) and (ii) it has heretofore been assumed that the
population has a continuous probability density. This paper validates the
existing solutions of (i) and (ii) assuming only a continuous cdf. It then modifies
these solutions so that they are valid for any cdf whatever.


2. Introduction. There are three problems of non-parametric estimation 
(we exclude point-estimation) for which fairly satisfactory solutions are available; 
their present status was summarized in a recent paper [4]. The purpose of this 
series of articles is to extend and complete the theory of non-parametric estima- 
tion in directions of both theoretical and practical interest. 

In this series we shall employ the following conventions of notation: We
distinguish between a random variable and an arbitrary point in the Euclidean
space containing its domain by using a capital Roman letter for the former and
the corresponding lower case Roman letter for the latter. Thus if $X$ is a (scalar)
random variable, and $x$ a real number or $\pm\infty$, we speak of the probability that
$X \le x$ and denote it by $\Pr\{X \le x\}$. Roman capitals will also be used to denote
cumulative distribution functions¹ (cdf's): A monotone non-decreasing function
$F(x)$ will be called the cdf of $X$ if $F(x + 0) = \Pr\{X \le x\}$. The definition of
$F(x)$ at its points of discontinuity will be immaterial. Again, $E = (X_1, \ldots, X_n)$
will denote a random sample from a population with cdf $F(x)$, whereas
$e = (x_1, \ldots, x_n)$ will denote a point in the sample space $R_n$. If $t$ is a function
of $e$ only, $t = \varphi(e)$, then the random variable $T = \varphi(E)$ is a statistic. The order
statistics of the sample $E$ are defined to be $-\infty, Z_1, \ldots, Z_n, +\infty$, where
$z_1 \le z_2 \le \cdots \le z_n$ is a rearrangement of $x_1, x_2, \ldots, x_n$. We shall write
$Z_0 = -\infty$, $Z_{n+1} = +\infty$. The device of including $+\infty$ and $-\infty$ among the order
statistics will enable us to avoid special statements to cover the case of one-sided
estimation. Confidence coefficients will be denoted by $1 - \alpha$. Finally, it will
be convenient to symbolize² the following three classes of cdf's: $\Omega_0$ is the class
of all univariate cdf's $F$; $\Omega_2$, the class of all continuous $F$; $\Omega_4$, the class of all
$F$ with continuous derivative $F'(x)$.


1 One of the authors wishes to point out the need of a clear, concise, and adequate term 
for this basic and important concept. 
2 The notation follows [3]. 
We now list the three problems. In each case it is understood that the solution
sought is to be valid for all cdf's in some chosen class. The names³ associated
with the problems are (i) W. R. Thompson, K. R. Nair, (ii) Wilks, (iii)
Wald, Wolfowitz, Kolmogoroff.

(i) To find confidence intervals for an unknown quantile $q_p$, where $q_p$ is
defined by $F(q_p) = p$, $0 < p < 1$; in other words, to find statistics $T_1$, $T_2$ such
that

(1.1)  $\Pr\{T_1 \le q_p \le T_2 \mid F\} = 1 - \alpha.$

(ii) To find tolerance limits $T_1$, $T_2$ which, with confidence $1 - \alpha$, will cover a
proportion $b$ or more of the population, that is,

(1.2)  $\Pr\{F(T_2) - F(T_1) \ge b \mid F\} = 1 - \alpha.$

(iii) To find a confidence band for an unknown cdf $F$, that is, a random region
$R(E)$ in the $x,y$-plane such that

(1.3)  $\Pr\{R(E) \text{ covers } g \mid F\} = 1 - \alpha,$

where $g$ is the graph of $y = F(x)$.

The existing solutions of problem (iii) are known to be valid for $F$ in $\Omega_2$,
but those of problems (i) and (ii) have been validated only for $F$ in $\Omega_4$. The
extension to $F$ in $\Omega_2$ is an immediate consequence of the theorem in section 4;
this section also contains a discussion of some other implications of the theorem.
In section 5 the appropriate modifications of the solutions of problems (i) and
(ii) are found which extend their validity to the general case $F$ in $\Omega_0$. Whereas
Pitman ([1]; also [4], p. 310) has shown how non-parametric tests may be
extended to the possibly discontinuous case, the only solution of the three
estimation problems previously extended to this case is that of Kolmogoroff for
problem (iii). Extension from $\Omega_2$ to $\Omega_0$ is of considerable practical interest, not
only in the case of populations ordinarily considered discrete, but also as
affecting the problem of the finiteness of the number of significant figures in
measurements and the resulting occurrence of "ties" in ranked measurements.
Before making these extensions we discuss in the next section the
transformations on which they are based.


3. Two useful transformations of random variables. We shall reserve the
symbol $X^*$ for a random variable having a uniform distribution on the interval
from 0 to 1. Its cdf is

(1.4)  $U(x^*) = \Pr\{X^* \le x^*\} = \begin{cases} 0 & \text{if } x^* \le 0, \\ x^* & \text{if } 0 \le x^* \le 1, \\ 1 & \text{if } x^* \ge 1. \end{cases}$

3 For bibliography see [4].

4 The notation $\Pr\{R \mid F_0\}$ denotes the probability of the relation $R$ being true,
calculated under the assumption that the cdf of the population is $F_0(x)$.
The device of transforming from any random variable $X$ with cdf $F$ in $\Omega_2$
to one with cdf $U$ was early used by Karl Pearson and more recently by many
others; it is known in the literature as the "probability integral transformation."
We define the transformation $x^* = h_F(x)$ as follows: For $-\infty < x < +\infty$,
$h_F(x) = F(x)$, $h_F(+\infty) = +\infty$, $h_F(-\infty) = -\infty$. If $F$ is in $\Omega_2$, the following
statements are evident for the transform $X^* = h_F(X)$: $X^*$ has $U(x^*)$ as its cdf.
With $X_i^* = h_F(X_i)$, a random sample $E = (X_1, \dots, X_n)$ from $F$ transforms
into a random sample $E^* = (X_1^*, \dots, X_n^*)$ from $U$. The order statistics
$\{Z_i\}$ of $E$ transform into the order statistics $\{Z_i^*\}$ of $E^*$ with $Z_i^* = h_F(Z_i)$,
$i = 0, 1, \dots, n + 1$.
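The probability integral transformation can be illustrated numerically. The following is a minimal sketch, assuming an exponential law as the continuous $F$; the function names and the seed are illustrative choices, not part of the paper.

```python
import math
import random

def exp_cdf(x, rate=1.0):
    """cdf of an exponential law, used here as an example of a continuous F."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def transform_sample(sample, cdf):
    """Apply x* = h_F(x) = F(x) to every observation."""
    return [cdf(x) for x in sample]

rng = random.Random(0)               # fixed seed, purely for reproducibility
E = [rng.expovariate(1.0) for _ in range(7)]
E_star = transform_sample(E, exp_cdf)

Z = sorted(E)                        # order statistics Z_1 <= ... <= Z_n of E
Z_star = sorted(E_star)              # order statistics of E*

# Because F is non-decreasing, transforming and sorting commute.
order_preserved = all(abs(zs - exp_cdf(z)) < 1e-12 for z, zs in zip(Z, Z_star))
print(order_preserved)
```

The check confirms only the order-statistic relation $Z_i^* = h_F(Z_i)$ for one sample; it does not test uniformity of $X^*$.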

It is easily seen that if $F$ is not in $\Omega_2$, the above transformation $Y = h_F(X)$
does not give $Y$ the cdf $U$; indeed, if $F$ is not in $\Omega_2$, the cdf of any single-valued
function $Y$ of $X$ is also not in $\Omega_2$, for there will be at least one point $x = x_0$ with
positive probability, and likewise for its transform $y_0$. Nevertheless our
arguments in section 4 depend on relating a random variable with arbitrary cdf $F$ in
$\Omega_0$ to the uniformly distributed $X^*$. While it is not possible to transform from
$X$ to $X^*$ without introducing a further random process, it is possible to transform
directly from $X^*$ to $X$. This suffices for our needs. We shall always denote
this transformation by $X = g_F(X^*)$. The following definition of the function
$x = g_F(x^*)$ makes it independent of the normalization of $F$ at its discontinuities:

(1.5)  $F(x - 0) \le U(x^*) \le F(x + 0).$


A sketched diagram may aid the reader in following the argument: To every
$x^*$ ($-\infty \le x^* \le +\infty$) there corresponds at least one $x$, and this $x$ is unique
unless it lies in an interval to which $F$ assigns zero probability. In the latter
case we shall assume that some $x$ in the interval is designated to be $g_F(x^*)$. It
will be seen that it is immaterial which $x$ is thus chosen. However if $x = -\infty$
or $+\infty$ is in an interval of constancy of $F$ we specify $g_F(-\infty) = -\infty$, $g_F(+\infty) = +\infty$.


To prove that $g_F(X^*)$ has the cdf $F(x)$ and thus can be identified with $X$, it
is sufficient to prove that $\Pr\{g_F(X^*) \le x\} = F(x + 0)$. Now $g_F(X^*) \le x$ if and
only if $X^* \le x_+^*$, where

$x_+^* = \sup_{x = g_F(x^*)} x^*.$

Hence $\Pr\{g_F(X^*) \le x\} = \Pr\{X^* \le x_+^*\} = U(x_+^*) = F(x + 0)$. It follows
that a random sample $E^*$ from $U$ transforms into a random sample $E$ from $F$.
The transformation preserves the relation "$\le$", that is, if $x_1 = g_F(x_1^*)$, $x_2 = g_F(x_2^*)$,
then $x_1^* \le x_2^*$ implies $x_1 \le x_2$. This means that the order statistics
$\{Z_i^*\}$ of $E^*$ transform into the order statistics $\{Z_i\}$ of $E$. We remark that
$x_1^* < x_2^*$ does not imply $x_1 < x_2$; there is trouble when $x_1^* < 0$ or $x_2^* > 1$, and
more serious trouble if $x_1^*$ and $x_2^*$ both go into the same discontinuity of $F$.
However, we shall need to utilize the fact that $x_1 < x_2$ implies $x_1^* < x_2^*$.
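A small numerical sketch may make the behaviour of $g_F$ at discontinuities concrete. It assumes a hypothetical three-point population and takes $g_F(x^*)$ to be the smallest $x$ with $F(x + 0) \ge x^*$, which is one admissible choice under (1.5); the names `VALUES`, `CUM` and `g_F` are illustrative.

```python
from bisect import bisect_left
import random

# Hypothetical discrete population: support points and the values F(x + 0).
VALUES = [0, 1, 2]
CUM = [0.2, 0.7, 1.0]

def g_F(x_star):
    """One admissible choice of g_F: the smallest x with F(x + 0) >= x*."""
    return VALUES[bisect_left(CUM, x_star)]

rng = random.Random(1)
E_star = sorted(rng.random() for _ in range(10))   # ordered uniform sample
E = [g_F(u) for u in E_star]

# Non-decreasing, with ties wherever several x* fall into one jump of F.
print(E)
```

The output is weakly ordered but not strictly ordered, which is the loss of strict inequality discussed in the text.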
4. Extension to continuous cdf's. A sufficient condition on $T_1$ and $T_2$ for a
solution (1.2) of problem (ii) to be valid for all $F$ in $\Omega_2$ is clearly that the joint
distribution of $F(T_1)$ and $F(T_2)$ be independent of $F$ in $\Omega_2$. If $\Pr\{F(T_i) = p \mid F\} = 0$
($i = 1, 2$), then (1.1) is equivalent to

(1.6)  $\Pr\{F(T_1) \le p \le F(T_2) \mid F\} = 1 - \alpha,$

and so a sufficient condition that a solution (1.1) of problem (i) be valid for all
$F$ in $\Omega_2$ is again that the joint distribution of $F(T_1)$ and $F(T_2)$ be independent of
$F$ in $\Omega_2$. We are thus led to consider sufficient conditions on a set $T_1, T_2, \dots, T_s$
of statistics, which will insure that the joint distribution of $F(T_1), F(T_2), \dots, F(T_s)$
be independent of $F$ in $\Omega_2$.

THEOREM: A sufficient condition for the joint distribution of $F(T_1), F(T_2), \dots, F(T_s)$
to be independent of $F$ in $\Omega_2$ is that the $\{T_i\}$ be a subset of the order statistics
$\{Z_i\}$ of the sample.

To prove the theorem it will suffice to show that the joint distribution of the
set of $n$ random variables $F(Z_1), F(Z_2), \dots, F(Z_n)$ is independent of $F$ in $\Omega_2$.
Let the cdf of the joint distribution be

(1.7)  $G_F(\lambda_1, \lambda_2, \dots, \lambda_n) = \Pr\{F(Z_1) \le \lambda_1, \dots, F(Z_n) \le \lambda_n \mid F\}.$

Employing the transformation $x^* = h_F(x)$ discussed in section 3, we see that the
above probability equals

(1.8)  $\Pr\{Z_1^* \le \lambda_1, \dots, Z_n^* \le \lambda_n\},$

where $Z_1^*, \dots, Z_n^*$ are the order statistics of a random sample $E^*$ from the
uniform cdf $U$. But this probability does not depend on $F$.
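The distribution-free property can be checked by simulation. The sketch below is an illustration under stated assumptions: it compares the Monte Carlo mean of $F(Z_k)$ for a normal and an exponential population with the value $k/(n+1)$, the standard mean of the $k$-th uniform order statistic. The helper names, seed and sample sizes are arbitrary choices.

```python
import math
import random

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exp_cdf(x):
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def mean_of_F_at_order_stat(draw, cdf, n, k, reps, rng):
    """Monte Carlo estimate of E[F(Z_k)] for samples of size n."""
    total = 0.0
    for _ in range(reps):
        z = sorted(draw(rng) for _ in range(n))
        total += cdf(z[k - 1])          # Z_k is the k-th smallest value
    return total / reps

rng = random.Random(2)
n, k, reps = 9, 3, 20000
m_norm = mean_of_F_at_order_stat(lambda r: r.gauss(0.0, 1.0), normal_cdf, n, k, reps, rng)
m_exp = mean_of_F_at_order_stat(lambda r: r.expovariate(1.0), exp_cdf, n, k, reps, rng)
print(m_norm, m_exp, k / (n + 1))       # both estimates should be near k/(n+1)
```

Agreement of one moment is of course weaker than the theorem, which concerns the whole joint distribution.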

Since the existing solutions of problems (i) and (ii) are obtained by taking
$T_1$ and $T_2$ to be order statistics, we have validated these solutions for all $F$ in
$\Omega_2$. That the existing solutions of problem (iii) are valid for $F$ in $\Omega_2$ has been
demonstrated by their authors; this is however also an easy consequence of the
above theorem. The sufficiency condition expressed by this theorem together
with a necessity condition of Robbins [2] may indicate a natural path to the
formulation and solution of further problems of non-parametric estimation.

From a theoretical point of view it is of interest to note that even in those
pathological cases where no probability density function exists for the cdf $F$
in $\Omega_2$ ($F$ is non-absolutely continuous), the joint distribution (1.7) of $F(Z_1)$,
$F(Z_2), \dots, F(Z_n)$ always possesses a density. That this density is $n!$ for
$0 \le F(Z_1) \le F(Z_2) \le \dots \le F(Z_n) \le 1$, and zero elsewhere, is evident if we consider
(1.8). By "integrating out" the other variables we are led to the following
practically useful result (it is well known for $F$ in $\Omega_4$): Choose any set $\{r_i\}$
of $s$ integers ($1 \le r_1 < r_2 < \dots < r_s \le n$), and consider the joint distribution
of $F(Z_{r_1}), F(Z_{r_2}), \dots, F(Z_{r_s})$. This has a probability density function
$f(t_1, t_2, \dots, t_s)$, providing $F$ is in $\Omega_2$, given by the formula

(1.9)  $f(t_1, t_2, \dots, t_s) = n! \, \dfrac{t_1^{r_1 - 1}}{(r_1 - 1)!} \, \dfrac{(1 - t_s)^{n - r_s}}{(n - r_s)!} \, \prod_{i=1}^{s-1} \dfrac{(t_{i+1} - t_i)^{r_{i+1} - r_i - 1}}{(r_{i+1} - r_i - 1)!}$

for $0 \le t_1 \le t_2 \le \dots \le t_s \le 1$, and $f = 0$ elsewhere. As is conventional, the
result of applying $\prod_{i=1}^{0}$ is to be interpreted as unity, and the meaning of $f$ is
given by

$\Pr\{F(Z_{r_i}) \le a_i \ (i = 1, 2, \dots, s) \mid F\} = \int_0^{a_1} \cdots \int_0^{a_s} f(t_1, \dots, t_s) \, dt_1 \cdots dt_s.$
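As an illustration, the density (1.9) can be evaluated directly. The sketch below assumes that (1.9) is the usual joint density of uniform order statistics, $n!\, t_1^{r_1-1}/(r_1-1)! \cdot (1-t_s)^{n-r_s}/(n-r_s)! \cdot \prod (t_{i+1}-t_i)^{r_{i+1}-r_i-1}/(r_{i+1}-r_i-1)!$; the function name and the numerical check are illustrative.

```python
from math import factorial

def joint_density(n, ranks, t):
    """Evaluate the density at t = (t_1, ..., t_s) for ranks r_1 < ... < r_s."""
    s = len(ranks)
    ordered = all(t[i] <= t[i + 1] for i in range(s - 1))
    if not (0.0 <= t[0] and t[-1] <= 1.0 and ordered):
        return 0.0
    value = factorial(n)
    value *= t[0] ** (ranks[0] - 1) / factorial(ranks[0] - 1)
    value *= (1.0 - t[-1]) ** (n - ranks[-1]) / factorial(n - ranks[-1])
    for i in range(s - 1):               # empty product when s = 1
        gap = ranks[i + 1] - ranks[i] - 1
        value *= (t[i + 1] - t[i]) ** gap / factorial(gap)
    return value

# s = 1: the density of F(Z_r) should integrate to 1 (midpoint rule).
n, r, grid = 10, 3, 2000
area = sum(joint_density(n, (r,), ((j + 0.5) / grid,)) for j in range(grid)) / grid
print(round(area, 6))
```

For $s = n$ the function returns $n!$ on the ordered region, in agreement with the remark preceding the formula.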


5. Extension to discontinuous cdf's. Suppose we have a solution of problem
(i) based on order statistics and hence valid for $F$ in $\Omega_2$, say,

(1.10)  $\Pr\{Z_k \le q_p \le Z_t \mid F\} = 1 - \alpha,$

where $0 \le k < t \le n + 1$. In particular this is valid for the uniform case,

(1.11)  $\Pr\{Z_k^* \le p \le Z_t^*\} = 1 - \alpha.$


We now transform from the uniform cdf $U$ to an arbitrary $F$ in $\Omega_0$ by means of
the transformation $x = g_F(x^*)$ described in section 3. Suppose $q_p$ is defined
by $q_p = g_F(p)$. This means the quantile $q_p$ of the distribution with cdf $F$ is
determined from the relation

$F(q_p - 0) \le p \le F(q_p + 0),$

which assigns to the quantile its usual meaning if $F(x)$ is continuous and
non-constant at $x = q_p$, and a sensible definition if $F$ is discontinuous or constant
at $q_p$. From the discussion in section 3 we have


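$(Z_k < q_p < Z_t)$ implies $(Z_k^* < p < Z_t^*)$ implies $(Z_k \le q_p \le Z_t)$,

and hence the probability relations

$\Pr\{Z_k < q_p < Z_t \mid F\} \le \Pr\{Z_k^* < p < Z_t^*\} \le \Pr\{Z_k \le q_p \le Z_t \mid F\}.$

Substituting (1.11), we have

(1.12)  $\Pr\{Z_k < q_p < Z_t \mid F\} \le 1 - \alpha \le \Pr\{Z_k \le q_p \le Z_t \mid F\}.$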


The statistical interpretation of (1.12) is the following: Consider any solution
(1.10) of problem (i), giving a confidence interval for the quantile $q_p$, valid for $F$
in $\Omega_2$. Then with the same values of $n$, $k$, $t$, and $\alpha$, the probability of the random
interval from $Z_k$ to $Z_t$ covering the unknown quantile $q_p$ is $\le 1 - \alpha$ for the open
interval, $\ge 1 - \alpha$ for the closed interval, no matter what the unknown cdf $F$.
If $F$ is continuous, the two probabilities are of course equal.
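The two-sided bound in (1.12) can be observed in a simulation. The sketch below assumes a hypothetical discrete population (uniform on five integers, so that the median is $q_p = 2$) and uses the binomial expression for the nominal level in the continuous case; `coverage` and the parameter values are illustrative.

```python
import random
from math import comb

def coverage(n, k, t, q, draw, reps, rng):
    """Estimate Pr{Z_k < q < Z_t} and Pr{Z_k <= q <= Z_t} by simulation."""
    open_hits = closed_hits = 0
    for _ in range(reps):
        z = sorted(draw(rng) for _ in range(n))
        lo, hi = z[k - 1], z[t - 1]
        open_hits += lo < q < hi
        closed_hits += lo <= q <= hi
    return open_hits / reps, closed_hits / reps

n, k, t, p = 10, 3, 8, 0.5
# Nominal level 1 - alpha for continuous F: k <= (number of X_i below q_p) <= t - 1.
nominal = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, t))

rng = random.Random(3)
# Hypothetical discrete population: uniform on {0, 1, 2, 3, 4}; the median is 2.
open_cov, closed_cov = coverage(n, k, t, 2, lambda r: r.randrange(5), 20000, rng)
print(open_cov, nominal, closed_cov)
```

With ties this frequent the open interval falls well short of the nominal level and the closed interval well above it, which is the practical content of the inequality.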
To extend the solution of problem (ii) to the general case $F$ in $\Omega_0$, suppose we
have a solution (1.2) using order statistics, say $T_1 = Z_k$, $T_2 = Z_t$ ($0 \le k < t \le n + 1$).
Such a solution will be valid for all $F$ in $\Omega_2$, in particular for $F = U$,

$\Pr\{U(Z_t^*) - U(Z_k^*) \ge b\} = 1 - \alpha.$

Given now any arbitrary distribution $F$, we again use the transformation
$x = g_F(x^*)$. From (1.5),

$F(Z_j - 0) \le U(Z_j^*) \le F(Z_j + 0) \quad (j = k, t).$

Hence

$B_- \le B^* \le B_+,$

where

$B_- = F(Z_t - 0) - F(Z_k + 0),$
$B^* = U(Z_t^*) - U(Z_k^*),$
$B_+ = F(Z_t + 0) - F(Z_k - 0).$

The implications

$(B_- \ge b)$ implies $(B^* \ge b)$ implies $(B_+ \ge b)$

yield the relations

$\Pr\{B_- \ge b\} \le \Pr\{B^* \ge b\} \le \Pr\{B_+ \ge b\}.$

These may be written

(1.13)  $\Pr\{F(Z_t - 0) - F(Z_k + 0) \ge b \mid F\} \le 1 - \alpha \le \Pr\{F(Z_t + 0) - F(Z_k - 0) \ge b \mid F\}.$
To interpret (1.13), let us say that a Borel set $S$ covers a proportion $\pi$ of a
population with cdf $F(x)$ if $\int_S dF(x) = \pi$. If $S$ is an interval from $x'$ to $x''$,
then the proportion covered by $S$ is $F(x'' + 0) - F(x' - 0)$ if $S$ is closed, and
$F(x'' - 0) - F(x' + 0)$ if $S$ is open. The proportion covered by a point $x$
is the jump $F(x + 0) - F(x - 0)$ of the cdf $F$ at $x$. The statistical meaning
of (1.13) is now clear: For the random interval from $Z_k$ to $Z_t$, the probability
that the open interval cover a proportion $\ge b$ of the population is $\le 1 - \alpha$, the
probability that the closed interval cover a proportion $\ge b$ of the population is
$\ge 1 - \alpha$, regardless of the population. Again, for a continuous $F$ the two
probabilities are equal.
ON A TEST FOR RANDOMNESS BASED ON SIGNS OF DIFFERENCES¹


By Henry B. Mann


Ohio State University


1. Introduction. It has been pointed out by J. Wolfowitz [1] that we cannot
expect a test for randomness to be most powerful with respect to every possible
alternative. It is therefore necessary to find tests designed to distinguish a
random sample of observations from the same population from a sample coming
from some particular class $\Omega$ of distributions. Such a test need be consistent
in the sense of Wald and Wolfowitz [2] only with respect to alternatives in the
class $\Omega$.

Let $x_1, \dots, x_n$ be the measurable quality characteristics of $n$ units of a
manufactured article. We shall assume that the distribution of $x_i$ is continuous.
According to Shewhart the production process is termed "under statistical
control" if $x_1, \dots, x_n$ can be regarded as a random sample of $n$ independent
items each coming from the same population with known or unknown
distribution function.

In a random sample $p_i = P(x_i > x_{i+1}) = \tfrac12$, where $P(E)$ denotes the
probability that $E$ will hold. The class $\Omega$ of alternatives which we shall consider is
described as follows. The cumulative distribution of $x_i$ is $f_i$ and the $f_i$,
$i = 1, 2, \dots$, are such that

$p_i = \tfrac12 + \varepsilon_i, \quad \sum_i \varepsilon_i = \lambda_n (n - 1), \quad \liminf_{n \to \infty} \lambda_n = \lambda > 0.$

Such a situation may, for instance, obtain if the production process is under
statistical control except for occasionally but not too infrequently occurring
periods during which the quality of the product decreases, after which decrease
statistical control is immediately restored. If the decreases in quality are sharp
enough or the periods of decrease long enough, then the alternative will belong
to the class $\Omega$ described before.

To give a practical example, consider a drill, which after some period of use will
wear off so that the quality of the manufactured article will decrease until the
drill is exchanged. After replacement of the drill by a new one, statistical
control is immediately restored. Now, if the drill is not replaced in time, the
periods of decrease in quality will be long and the rate of decrease will become
rapid so that the sequence of distribution functions will satisfy the conditions
of the class $\Omega$. A similar situation occurs also in time studies. For instance,
in the foregoing example, the time necessary for drilling one hole will tend to
increase when the drill is too long in use.

The following test, first proposed by Moore and Wallis [3] for the study of
economic time series, seems appropriate for our purpose: Let $x_1, \dots, x_n$ be the
sample and form the sequence $x_2 - x_1, \dots, x_n - x_{n-1}$. Let $S$ be the number
of negative differences in this sequence. Clearly, the distribution of $S$ is
independent of the distribution of $x_i$ provided the sample is an independent random
sample from a continuous distribution. Under one of the alternatives of the
class $\Omega$, $S$ will in a sample of $n$ tend to be larger than in a random sample if $\lambda_n > 0$.
Hence $S$ may be used as a statistic to distinguish between randomness and any
of the alternatives of the class $\Omega$. The distribution of $S$ was tabulated by
Moore and Wallis [3] for $n \le 12$. They also found empirically that $S$ approaches
a normal distribution. The asymptotic normality of the distribution of $S$
can be proved rigorously in a way analogous to the proof of Theorem 1 of a
paper by Wolfowitz [4]. The first four moments of $S$ were obtained by Moore
and Wallis, the fourth moment, however, only by empirical methods. In
this paper we shall derive a formula which makes it possible to compute the
moments of $S$ recursively. With the help of this formula we shall indicate an
alternative proof of the asymptotic normality of $S$ using the method of moments.
Finally, we shall derive a lower bound for the power of the $S$ test with respect
to alternatives in $\Omega$ valid for large $n$ and depending only on $\lambda_n$.

1 Research under a grant of the Research Foundation of the Ohio State University.
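The statistic itself is simple to compute. A minimal sketch, with a made-up sample and an illustrative function name:

```python
def s_statistic(xs):
    """Number of negative differences x_{i+1} - x_i in the sequence xs."""
    return sum(1 for a, b in zip(xs, xs[1:]) if b - a < 0)

sample = [1.0, 3.0, 2.0, 5.0, 4.0]     # made-up measurements
print(s_statistic(sample))             # differences: +2, -1, +3, -1
```

Ties are not handled specially here, in line with the assumption in the text that the distribution of $x_i$ is continuous.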


2. The moments of S. Let $P_n(S)$ be the number of permutations in $n$ variables
with $S$ negative differences. MacMahon [5] has shown that

(1)  $P_n(S) = (S + 1) P_{n-1}(S) + (n - S) P_{n-1}(S - 1).$
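The recursion (1) can be run directly. The sketch below assumes the reading $P_n(S) = (S+1)P_{n-1}(S) + (n-S)P_{n-1}(S-1)$ with $P_1(0) = 1$, compares it with a brute-force count over all permutations for small $n$, and computes central moments exactly with rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def descent_counts(n):
    """Return [P_n(0), ..., P_n(n-1)] from the recursion, starting at n = 1."""
    counts = [1]                                   # P_1(0) = 1
    for m in range(2, n + 1):
        prev = counts + [0]                        # P_{m-1}(m-1) = 0
        counts = [(s + 1) * prev[s] + (m - s) * (prev[s - 1] if s > 0 else 0)
                  for s in range(m)]
    return counts

def brute_force(n):
    """Count permutations of n items by number of negative differences."""
    counts = [0] * n
    for perm in permutations(range(n)):
        counts[sum(b < a for a, b in zip(perm, perm[1:]))] += 1
    return counts

def central_moment(n, order):
    """Exact central moment of S under randomness."""
    counts = descent_counts(n)
    total = sum(counts)
    mean = Fraction(n - 1, 2)
    return sum(Fraction(c, total) * (s - mean) ** order for s, c in enumerate(counts))

print(descent_counts(5), brute_force(5))
print(central_moment(12, 2))
```

Exact fractions avoid any rounding question when the moments are compared with closed-form expressions.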


Using (1) Moore and Wallis [3] have tabulated $P\left(\left|S - \frac{n-1}{2}\right| \ge \left|\bar S - \frac{n-1}{2}\right|\right)$.
In using their table for our purpose, one has to keep in mind that we are using
a one tail region; therefore $P(S \ge \bar S)$ is for $\bar S > \frac{n-1}{2}$ one half of the value
tabulated by Moore and Wallis.

Clearly the first moment of $S$ is $\frac{n-1}{2}$, since the expected number of $-$ signs
equals the expected number of $+$ signs. To find higher moments we multiply (1)
by $\left(S - \frac{n-1}{2}\right)^i$, divide by $n!$ and sum over $S$. Then we obtain

(2)  $E_n\left[\left(S - \tfrac{n-1}{2}\right)^i\right] = \dfrac{1}{n} E_{n-1}\left[\left(S - \tfrac{n-1}{2}\right)^i (S + 1)\right] + \dfrac{1}{n} E_{n-1}\left[(n - S - 1)\left(S - \tfrac{n-1}{2} + 1\right)^i\right],$

where $E_n[f(S)]$ denotes the expectation of $f(S)$ in permutations of $n$ variables.
From (2) we have 
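Putting $S - E(S) = x$ we obtain

(3)  $E_n(x^i) = \dfrac{1}{n} E_{n-1}\left[x\left(x - \tfrac12\right)^i - x\left(x + \tfrac12\right)^i + \dfrac{n}{2}\left\{\left(x + \tfrac12\right)^i + \left(x - \tfrac12\right)^i\right\}\right].$

From the symmetry of the distribution as well as from (3) it may be seen that
all odd moments are 0 and therefore

$E\left[\left(x + \tfrac12\right)^{2i} + \left(x - \tfrac12\right)^{2i}\right] = 2E\left(x + \tfrac12\right)^{2i},$

$E\left[x\left(x - \tfrac12\right)^{2i} - x\left(x + \tfrac12\right)^{2i}\right] = -2E\left(x + \tfrac12\right)^{2i+1} + E\left(x + \tfrac12\right)^{2i}.$

Hence we obtain from (3)

(4)  $E_n(x^{2i+1}) = 0, \quad i = 0, 1, \dots,$

$E_n(x^{2i}) = \dfrac{n+1}{n} E_{n-1}\left[\left(x + \tfrac12\right)^{2i}\right] - \dfrac{2}{n} E_{n-1}\left[\left(x + \tfrac12\right)^{2i+1}\right].$

If all moments below the $2i$th moment are known (4) becomes a difference
equation whose solution yields the $2i$th moment for $n \ge 2i$. Thus one obtains

$\sigma^2(S) = E_n(x^2) = \dfrac{n+1}{12}, \qquad E_n(x^4) = \dfrac{(n+1)^2}{48} - \dfrac{n+1}{120},$

$E_n(x^6) = \dfrac{35(n+1)^3 - 42(n+1)^2 + 16(n+1)}{4032}.$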




It is not difficult to prove from (4) by induction that
$\lim_{n \to \infty} E_n(x^{2i}) / \sigma^{2i}(S) = (2i - 1)(2i - 3) \cdots 3 \cdot 1$.
To do this one proves first by induction that $E_n(x^{2i})$ is for $n \ge 2i$ a
polynomial in $n$ of degree $i$. It can then be proved by induction that the first
coefficient of this polynomial is $(2i - 1)(2i - 3) \cdots 3 \cdot 1 / 12^i$ from which the
assertion follows. Since $(2i - 1) \cdots 3 \cdot 1$ are the moments of a normal
distribution with variance 1 it follows that $\left(S - \frac{n-1}{2}\right)\sqrt{12} / \sqrt{n+1}$ is in the limit normally
distributed with mean 0 and variance 1. This result follows, however, also
easily from Theorem 2 of a paper by Wolfowitz [4].

It is also possible to show by induction from equation (4) that for $n \ge 2i$ the
$2i$th moment of $S$ is smaller than the corresponding moment of a normal
distribution with variance $\frac{n+1}{12}$.
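The approach of the standardized moments to the normal values can be watched numerically. The following sketch recomputes the null distribution of $S$ from MacMahon's recursion and prints the fourth and sixth moments divided by the matching power of the variance $(n+1)/12$; the normal limits are 3 and 15. Function names and the chosen values of $n$ are illustrative.

```python
from fractions import Fraction

def descent_distribution(n):
    """Probabilities of S = 0, ..., n-1 under randomness, via MacMahon's recursion."""
    counts = [1]
    for m in range(2, n + 1):
        prev = counts + [0]
        counts = [(s + 1) * prev[s] + (m - s) * (prev[s - 1] if s > 0 else 0)
                  for s in range(m)]
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

def standardized_moment(n, order):
    """E_n(x^order) / sigma^order with x = S - (n-1)/2 and sigma^2 = (n+1)/12."""
    probs = descent_distribution(n)
    mean = Fraction(n - 1, 2)
    raw = sum(p * (s - mean) ** order for s, p in enumerate(probs))
    return float(raw) / (float(Fraction(n + 1, 12)) ** (order / 2))

for n in (10, 40, 160):
    print(n, standardized_moment(n, 4), standardized_moment(n, 6))   # limits: 3 and 15
```

In these runs the ratios lie below the normal values and rise towards them, consistent with the remark that the moments of $S$ are smaller than the corresponding normal moments.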





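3. The power of the S test. Let us assume now that one of the alternatives
of the class $\Omega$ is true. This is to say $p_i = P(x_i > x_{i+1}) = \tfrac12 + \varepsilon_i$,
$\sum_i \varepsilon_i = \lambda_n (n - 1)$, $\liminf \lambda_n = \lambda > 0$. Let

$z_i = \begin{cases} 1 & \text{if the } i\text{th sign is } -, \\ 0 & \text{if the } i\text{th sign is } +. \end{cases}$

We shall show that

$P(z_{i+1} = 1 \mid z_i = 1) \le P(z_{i+1} = 1).$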
We have 


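$\int_{x_1}^{\infty} df_2(x_2) \int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3) \le \int_{x_1}^{\infty} df_2(x_2) \int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_1} df_3(x_3)$

$\qquad \le \int_{-\infty}^{x_1} df_2(x_2) \int_{x_1}^{\infty} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3).$

Adding $\int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3)$ to both sides of this inequality we
have

$\int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3) \le \int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{\infty} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3).$

Integrating both sides with respect to $x_1$, we obtain

$\int_{-\infty}^{\infty} df_1(x_1) \int_{-\infty}^{x_1} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3) \le \left[\int_{-\infty}^{\infty} df_1(x_1) \int_{-\infty}^{x_1} df_2(x_2)\right] \left[\int_{-\infty}^{\infty} df_2(x_2) \int_{-\infty}^{x_2} df_3(x_3)\right]$

or

$P(z_1 = 1 \text{ and } z_2 = 1) \le P(z_1 = 1) \cdot P(z_2 = 1).$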


_ 


From this it follows that $\sigma_{z_i z_{i+1}} \le 0$. Since $\sigma^2_{z_i} = \tfrac14 - \varepsilon_i^2$ we have
$\sigma_S^2 \le \sum_i \left(\tfrac14 - \varepsilon_i^2\right) \le \dfrac{n-1}{4}\left(1 - 4\lambda_n^2\right)$. Moreover $E(S) = \dfrac{n-1}{2} + \lambda_n (n - 1)$.


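Let $\lambda' = \lambda$ if $\lambda < \tfrac12$ and $0 < \lambda' < \lambda$ if $\lambda = \tfrac12$. The critical region is for
sufficiently large $n$ given approximately by $S \ge \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}$, where $t$
depends on the level of significance $\alpha$ and must be chosen so that
$\frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-x^2/2}\,dx = \alpha$. Hence, if we can show that under any alternative $H$ of the class
$\Omega$ and for any $\varepsilon > 0$

(5)  $P\left(S \le E(S) - \dfrac{t}{2}\sqrt{(n - 1)(1 - 4\lambda'^2)}\right) \le \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{-t} e^{-x^2/2}\,dx + \varepsilon$

for every $t \ge \bar t > 0$, $n \ge N(\varepsilon, H, \bar t)$, then we shall be able to give a lower bound
for the power of the $S$ test. The power of the $S$ test is approximately given
by $P\left(S \ge \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}\right)$. From (5) we obtain

(6)  $P\left(S \ge \dfrac{n-1}{2} + t\sqrt{\dfrac{n+1}{12}}\right) \ge \dfrac{1}{\sqrt{2\pi}} \int_{c}^{\infty} e^{-x^2/2}\,dx - \varepsilon, \qquad c = \dfrac{t\sqrt{n+1} - 2\lambda_n (n-1)\sqrt3}{\sqrt{(3n-3)(1 - 4\lambda'^2)}},$

for $\dfrac{t\sqrt{n+1} - 2\lambda_n (n-1)\sqrt3}{\sqrt{3(n-1)(1 - 4\lambda'^2)}} \le -\bar t < 0$, $n \ge N(\varepsilon, H, \bar t)$.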


The author considers it safe to assume that (6) holds with a fairly small $\varepsilon$ for
$n \ge 12$ if $\lambda'$ in (6) is replaced by $\lambda_n'$, where $\lambda_n' = \lambda_n$ if $\lambda_n < \tfrac12$ and $\lambda_n' < \tfrac12$ if $\lambda_n = \tfrac12$
and if $\lambda_n'$ is not too close to $\tfrac12$. He bases this belief on the rapidity with which
the distribution of $S$ approaches normality under the null hypothesis of
randomness, and on the fact that at least under the null hypothesis the moments of $S$ are
smaller than the corresponding moments of a normal distribution. It may also
be seen from the following derivation of (6) that in many cases the power of the $S$
test will be considerably above the lower bound given in (6).

To prove (5), we need the following two lemmas 

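Lemma 1. Let $P(x \le t) = f(t)$. Let further $E(z) = 0$, $E(z^2) = \varepsilon$. Then for
every $\delta > 0$

(7)  $f(t + \delta) + \dfrac{\varepsilon}{\delta^2} \ge P(x + z \le t) \ge f(t - \delta) - \dfrac{\varepsilon}{\delta^2}.$

Proof: Applying Tschebycheff's inequality we have

$P(x + z \le t) \le P(x \le t + \delta) + P(x > t + \delta \text{ and } z < -\delta)$

$\qquad \le P(x \le t + \delta) + P(z < -\delta) \le f(t + \delta) + \dfrac{\varepsilon}{\delta^2},$

$P(x + z \le t) \ge P(x \le t - \delta \text{ and } z \le \delta)$

$\qquad \ge P(x \le t - \delta) - P(z > \delta) \ge f(t - \delta) - \dfrac{\varepsilon}{\delta^2}.$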
Lemma 2. Let $\{x_i\}$, $i = 1, 2, \dots$ be a sequence of independent random variables
with mean 0, bounded $k$th absolute moment, $k > 2$, and variance $\sigma_i^2$. Let $M > 0$
and $\limsup \frac{\sum \sigma_i^2}{n} \le M^2$. Form the sequence of random variables
$y_n = \dfrac{x_1 + \cdots + x_n}{M\sqrt{n}}$;
then for any $\varepsilon > 0$ and any $t \ge \bar t > 0$

(8)  $P(y_n \le -t) \le \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{-t} e^{-x^2/2}\,dx + \varepsilon \quad \text{for } n \ge N(\varepsilon, \bar t).$


Proor. Form a sequence mz with lim mz = 0. Let yn, = 2% + 2% 


a->0o 


where 


». ai Pee © hn 2 2 





se =, = = 
M<VJ/n Mv n 


>.” denotes summation over all i for which ¢; > ma and all sums extend from 
one to n. 
Let f% be the distribution of 2% then by Lemma 1 


9 


m 


m.. a Qa 
fu(—t + 8) + 7 = Pn S —#) 2 fa(—-t — 8) — H- 


Now we distinguish two cases. 

1st Case. The number of integers i with o; > mz, is for some a@ of order n. 
In this case {f%,} differs arbitrarily little from a sequence of normal distributions 
with mean 0 and the upper limit of the variances at most 1. 

2nd Case. The number of integers 7 with o; > mz is for every a of smaller 
order than n. In this case x% converges stochastically to 0. In both cases 
(8) holds true since ma can be chosen arbitrarily small. 

We can now prove (5). It follows easily from Tschebycheff’s theorem that 
(5) is true if \ = 4. Hence we may assume \ < 3}. Let 2; be defined as at 


the beginning of this section. Form 
- 2(z; — E(zi)) 
GTi V(n — 12) = 2)’ 
V/(n — 1)(1 — 402)’ * pt JH — 10 — 2) 


where m’ = gk is the largest integer multiple of k which does not exceed (n — 1). 
We form further 


k 
oo 


— 
ll 
S 
. 
ll 
S 





k k & k 
2, = vj, 2, = Uj. 
j=1 j=1 
Si a < ag — 1) it follows f Lemma 1 that 
Since ot < im — Da —-d) ~ ka — 2’ it follows rom MM: a 
the distribution of $2(S - E(S)) / \sqrt{(n - 1)(1 - 4\lambda'^2)}$ differs arbitrarily little from the
distribution of $z'$ for sufficiently large $n$ and $k$. The second and the third absolute
moments of $\sqrt{n - 1}\, v_j$ are bounded. Hence $\sqrt{n - 1}\, v_j$ fulfills the
conditions of Lemma 2. The application of Lemma 2 yields (5) and
consequently (6).

The integer N(e, H, ¢) is independent of ¢ provided the lower limit of the 
integral does not exceed —i?. Hence we have proved 

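THEOREM. Let $t_1, t_2, \dots$ be any sequence of numbers satisfying the condition

$\tau_n = \dfrac{t_n\sqrt{n+1} - 2\lambda_n (n-1)\sqrt3}{\sqrt{(3n-3)(1 - 4\lambda'^2)}} \le -\bar t < 0,$

where $\lambda' = \liminf_{n \to \infty} \lambda_n$ if $\liminf_{n \to \infty} \lambda_n < \tfrac12$ and $0 < \lambda' < \tfrac12$ otherwise. Let $P_n(S, H)$
be the power of the $S$ test with respect to the alternative $H$ and critical region
$S \ge \frac{n-1}{2} + t_n\sqrt{\frac{n+1}{12}}$. Then

(9)  $\liminf_{n \to \infty} \left[ P_n(S, H) \Big/ \dfrac{1}{\sqrt{2\pi}} \int_{\tau_n}^{\infty} e^{-x^2/2}\,dx \right] \ge 1.$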
no tn 


It is worthwhile to remark that (9) is sharp. That is to say there exist
alternatives for which the left side of (9) is equal to 1. This is obviously the case
for any alternative with $P(x_i > x_{i+1}) = \tfrac12 + \lambda$ and $P(z_i = 1 \text{ and } z_{i+1} = 1) =
P(z_i = 1) \cdot P(z_{i+1} = 1)$. These conditions are, for instance, fulfilled by the
alternative given by P(aj41 = a —8 —--- — 8) = 34+, Pin =CH+EH+ 


| 


26 
--» +6') = 4—A,7 = 1,2, --- where (a — c) >j_37 9%: 


If $t_n = t$ for every $n$ then (9) implies the consistency of the test if the order
of $\lambda_n$ is larger than $1/\sqrt{n}$. It may also be seen that the test is not
consistent with respect to alternatives for which $\lambda_n$ is of order at most equal to $1/\sqrt{n}$.
This remark refers of course only to alternatives for which $x_i$ is independent of
$x_j$ for $i \ne j$.
THE ASYMPTOTIC DISTRIBUTION OF RUNS OF CONSECUTIVE
ELEMENTS


By Irving Kaplansky
New York City


In a permutation of $1, 2, \dots, n$ let $r$ denote the number of instances in which
$i$ is next to $i + 1$, i.e., in which either of the successions $(i, i + 1)$ or $(i + 1, i)$
occurs. Thus for the permutation 234651, $r = 3$. In [3] Wolfowitz¹ has
proposed the use of $r$ for significance tests in the non-parametric case, and in [4]
he has shown that asymptotically $r$ has the Poisson distribution with mean
value 2. It is to be noted that $W(R)$, the number of runs as defined by
Wolfowitz, is equal to $n - r$.
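The count $r$ is easily computed for a given permutation. A minimal sketch, using the example from the text; the function name is illustrative:

```python
def successions(perm):
    """Number of adjacent positions holding consecutive integers, in either order."""
    return sum(1 for a, b in zip(perm, perm[1:]) if abs(a - b) == 1)

print(successions([2, 3, 4, 6, 5, 1]))   # the permutation 234651 from the text
```

The three successions counted here are (2, 3), (3, 4) and (6, 5).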

In this note we shall derive more explicit results concerning the asymptotic
distribution of $r$. In a random permutation (all permutations being regarded
as equally probable) let the probability of exactly $r$ successions as above be
$P(n, r)$, and let $M(n, k)$ denote the $k$-th factorial moment of the distribution,
that is

$M(n, k) = \sum_r r(r - 1) \cdots (r - k + 1) P(n, r).$

We shall show that


el _k+1/fk\k k+2/k kk-1)_ 
ee es E Qe (7)! + PE (5) n(n — 1) | 


_ we" r—3r , rf —8r 4 Or + 22r — - -3 
@) Pian) = Z| ER ER tom, 














Since $2^k$ is the $k$-th factorial moment of the Poisson distribution with mean 2,
either of these results serves to verify the asymptotic Poisson character of the
distribution of $r$.

It would be possible to obtain some kind of explicit formula for the general 
term of (2), but there seems to be no reasonably simple form. 

Proof of (1). Let $A_i$ denote the event "$i + 1$ comes right after $i$" and $B_i$
the event "$i$ comes right after $i + 1$" ($i = 1, \dots, n - 1$). The joint
probability of $k$ of these $2n - 2$ events is either 0, if they are incompatible,
or $(n - k)!/n!$ if they are compatible, for in the latter case we in effect assign
positions for $k$ of the elements and are then free to permute the $n - k$ others.
Let $f(n, k)$ denote the number of ways of selecting $k$ compatible events. Then
it is known that ([1], eq. (40))

(3)  $M(n, k) = k!\, f(n, k)\, (n - k)!/n! = f(n, k) \Big/ \binom{n}{k}.$


1 I am indebted to Dr. Wolfowitz for calling my attention to this problem, and to its
identity with what I called the "n-kings problem" in [2].
The relations of incompatibility can be summarized by the statement that
$A_i$ is incompatible with $B_j$ if $|i - j| \le 1$. In view of (3), our task thus reduces
to the proof of the following combinatorial lemma.

Lemma. Suppose $2n - 2$ objects $A_1, \dots, A_{n-1}, B_1, \dots, B_{n-1}$ are given.
Let $f(n, k)$ denote the number of ways of selecting $k$ objects with the restriction that
$A_i$ and $B_j$ must not both be chosen when $|i - j| \le 1$. Then


(4) fo. 8). + (— yA = })- 


Proof. We split the acceptable selections into two subsets: those which
include $A_{n-1}$ and those which do not. Let the latter be $g(n, k)$ in number.
Since the selections which include $A_{n-1}$ must omit $B_{n-1}$ and $B_{n-2}$, it is clear that
they are $g(n - 1, k - 1)$ in number. Thus

(5)  $f(n, k) = g(n, k) + g(n - 1, k - 1).$

Similarly we split the selections which omit $A_{n-1}$ according as they omit or
include $B_{n-1}$; we obtain

(6)  $g(n, k) = f(n - 1, k) + g(n - 1, k - 1).$

Elimination of $g$ from (5) and (6) yields²

(7)  $f(n, k) = f(n - 1, k) + f(n - 1, k - 1) + f(n - 2, k - 1).$

We can now make an inductive proof of (4). Assuming (4), we have 


fin,k) —f(n—1,k) _ y vit e-i-4 
a k-i-1 
f(n — 2,k — 1) 


; waar 2A! at n-t—l1 - (77573) 
Qk-1 iit 2i(k — 1) k-i-1 k-i-2 
_ op _vi(n-i- k+i-1(k-1\, k+i-2 am) 
_ “=i: [hein (es )+ a= (; )I 
In view of the identity 
bs(?) Reset (hot) Ret 3 Ro )) 
“hk NJ RHI k-1 \V-1 


we now readily verify that the right hand side of (4) satisfies (7). To complete 
the induction we must check the appropriate boundary conditions. According 


to (4) we have 
flk,k) _< ik a 7 
Qk " 2 , las 


f(n, 1) = 2n — 2, both as they should be. 





2 This recursion formula is essentially the same as equation (20) in [2]. 
Note. There are various other formulas for $f(n, k)$; we have selected (4) as
it exhibits the asymptotic behaviour best. In an unpublished investigation
John Riordan obtained a neat representation as a hypergeometric function:

$f(n, k) = 2(n - k)\, F(1 - k,\ 1 + k - n;\ 2;\ 2)$

and derived corresponding recursion formulas. Essentially the same result
was given by Wolfowitz [3]. Still another formula given by Riordan is


fin, k) oes I> 


A symbolic version is given in §5 of [2]. 
Proof of (2). From the formula of Poincaré ([1], eq. (29))

$r!\, P(n, r) = \sum_{k \ge r} (-1)^{k - r} M(n, k)/(k - r)!$

or, in a cabalistic symbolic form, $P(n, r) = M^r e^{-M}/r!$. We substitute the
successive terms of (1) and we may let the sum run to infinity at a cost of $O(n^{-m})$
for any positive $m$. The first term contributes³

$\sum_{k=r}^{\infty} (-1)^{k-r}\, 2^k/(k - r)! = 2^r \sum_{i=0}^{\infty} (-2)^i/i! = 2^r e^{-2}.$

Again since

$k^2 + k = (k - r)(k - r - 1) + (2r + 2)(k - r) + r^2 + r,$

the next term yields

$\sum_{k=r}^{\infty} (-1)^{k-r}\, 2^k (k^2 + k)/(k - r)! = 2^r e^{-2}\left(4 - 2(2r + 2) + r^2 + r\right) = 2^r e^{-2}(r^2 - 3r),$

and so on in obvious fashion.

Some indication of the asymptotic behavior of $P(n, r)$ is afforded by the
following table for $n = 10$. It is to be noted that, because of the form of (2),
the approach to Poisson is much more rapid for $r = 0$ and 3 than for other $r$.

 r    P(10, r)    Poisson    First two terms of (2)
 0     .132        .135        .135
 1     .300        .271        .298
 2     .305        .271        .298
 3     .179        .180        .180
 4     .065        .090        .072
 5     .015        .036        .018
 6     .002        .012        .001
 7     .000        .003       -.001

3 My thanks are due to Mr. Riordan for correcting an error in this section, and for many 
helpful suggestions concerning the entire paper. 
ON THE APPROXIMATE DISTRIBUTION OF RATIOS
By P. L. Hsu


National University of Peking


The purpose of this paper is to apply Cramér's theorem of asymptotic
expansion¹ and Berry's theorem² to study the approximate distribution of ratios of the
following two types:

(I)  $Z = \dfrac{\frac{1}{n}(Y_1 + \cdots + Y_n)}{\frac{1}{m}(X_1 + \cdots + X_m)} = \bar Y / \bar X,$

(II)  $Z = Y \Big/ \tfrac{1}{m}(X_1 + \cdots + X_m) = Y / \bar X.$

In (I) the $X_i$, $Y_i$ are independent, the $Y_i$ are equi-distributed,³ and the $X_i$ are
equi-distributed and positive. In (II) $X_1, \dots, X_m$, $Y$ are independent and
positive, and the $X_i$ are equi-distributed.


1. The ratio (I). Assume that (I1) the absolute $k$th moment of $X_i$ and that
of $Y_i$ are finite and positive, where $k$ is a fixed integer $\ge 3$,
(I2) the distribution of $X_i$ and that of $Y_i$ are non-singular.

Let 


9 


t = €(X,), =<(Y), oe=(X)-F, P=d¥Y)-7 


U = (T =p, v=" (PF - »). 


Let F(x), G(x) and H(z) be respectively the distribution functions of Z, U and 


V. Let 
ox ey as 
— (<= ‘+ Se 


Then the relation Z < x is equivalent to 


es 2S . « 
“iva tan 


1 H. Cramér. Random Variables and Probability Distributions (1937), Chap. 7.

2 A. C. Berry. "The accuracy of the Gaussian approximation to the sum of independent
variates", Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-136.

3 The $Y_i$ are said to be equi-distributed if all $Y_i$ have the same distribution function.

For simplicity we shall assume $x > 0$; the results are, however, general. Then


U V 
the distribution functions of — im and Jk are 


a hw nai) V (ov =) 
Hence, by the theorem of convolution, 
oo ¢ ba 
(1) F(x) = [ jl - o( -bvne — 9) an (?¥™)., 


Here we recall the theorems of Cramér and Berry: Under the conditions (11) 
and (12) 








(2) Ge) = @(2) + . SD + a 





where 


1 v 
i—=_1 ~~ - ptt 
%(%) = 75, [« dy, _—~P,() 2, Cp P(x), 


and | D,, | is less than a positive number which depends only on k and the distribu- 
tion of X;. If k = 3, condition (I2) may be removed.* 


Analogously, 

; Dy. 
(3) H(x) = ®(x) + : * ae + + 5. 
where 


Q(x) = > dj b”**” (x). 


In the sequel we shall use the letter A; to denote an unspecified quantity such 
that | A; | is less than a positive number which depends only on k, the distribu- 
tion of X; and the distribution of Y;. 

Using (2) we have 


(4) 1— G(- 2) = oe) + Pe) 5 D; 


yvl2 mi-2) 





and this making this substitution in (1) we get 


Fee) = [7 02RD) an (V0) 


= m2 mik-2) ? 








4 This last assertion constitutes Berry’s theorem. 





sewer - ==. 
and so by partial integration,


rey = [a(t 9 » oo (°~/) 





oak os 
Making the transformation y = o2xv/b+/m and writing 


bv/n gaoVne 
T ; rv/m 





(5) a= 


we get 


k-3 








A: 
= lo + << = + a) * 
For Jo we use (3) and obtain 
I= [ P(au — Bv)b’(v) dv + 7 ph ck. Q,(au — Bv)&’(v) dv + 


For J, we use (3) with k replaced by k — v. Thus 


k— - 


I, = [ B(au — Bv)P)(v) dv + Jf Q,(au — Bv)P,(v) dv + 


Combining these results we get 





(6) Fie) = [| ®au — f)0"@) do + > 





k-—3 v 2 
7 [eau — BoP iv) do 


y=] m?!2 


pal m= nel 


where 


follows easily from the theorem of convolution that 


| P(au — Bv)&’(v) dv = P'(u). 


Ai 


mi (k—2) 


F(z) = [ H(du — Bv)®'(v) dv + > — rh H(au — By)P, (v) dv + aa 


Ax 


nitk—2) * 


nik— = - 


ae Q,(au — Bv)’(v) dv 
-" ) 


k—3 k—3—»p (—1)’ si , 
+ a > wel Q.(au — Bu)P,(v) dv + Re, 


_ _” ss ee 
Ri - mi &-2) + ots ) + L _.%. oa ters 2») Ax (7 > Ta) Fi 


Now by (5), « > O and a’ — gp’ = 1. For such values of a and 8, however, it 
As differentiation under the integration sign is justified by the boundedness of
the derivatives of $\Phi$ we have


a” [ 6” (au — Bv)d"(v) dv = &”(u). 


Repeated partial integration then gives 


[ &” (au — pv)?” (v) dv ef e?t2) (au — Bv)'(v) dv 


a: 
= eae iia, 





Hence 
r Q,(au — Bv)*'(v) dv = : dip [ @”*?)) (ay — Bv)'(v) dv 
=> er), 
[F (ou — poyP $0) do = Bex [Bleu — Arye? () do 
- > err, 


[ 4 Q,(au — pv) Pi(v) dv = > = din Civ | b**) (au — pr)b?***” (y) dv 


t=] j=1 





Bt utrtzitad) 
om > = din Cir orev TaTai petit cy) 


t=1 j=1 


Making all these substitutions in (6) we obtain the final result 


F(t) = ®(u) + : —— > te BE" y le > J me prt? (, u) 











1 nl? & 
k—3 k—3—pv p” v+2) 
ce (u+v+2i +27) 
+ 2d 4 m’? mab : dinCy op th t2i +25 D eee (u) 
v= p= t=] 7= 


1 1 \e 
+ At is a/m +7) . 
If k = 3, the result remains true without the condition (I2).
2. The ratio (II).⁵ Here we make the following assumptions:
(II1) The kth moment of X_i is finite and positive, where k is a fixed integer
≥ 3, E(X_i) = 1, E(X_i²) − 1 = σ².
(II2) The distribution of X_i is non-singular.


⁵ As the case E(X_i) = 0 is excluded, there is no loss of generality in this assumption.
Let U = √m (X̄ − 1)/σ, and F(x), G(x) and H(x) be respectively the
distribution functions of Z, U and Y. Then


F(x) yx 
2) = = <2. 
/ m~ 
Because of the positiveness of X; and Y we may always assume x > 0. Then, 
by the theorem of convolution, 


F(z) = cs i. “ dH(y). 


Using (4) we have 


where, as throughout the rest of this paper, A_k represents an unspecified quantity
such that |A_k| is less than a positive number depending only on k, the distribution
of X_i and the distribution of Y. By partial integration we get


00 ( acs k—? _ p(w ay *) ; 
(7) F(x) = [ H(x —y) neds — FP, a n+ -, 


. m* 
m2 


” ox ; 2 _ Ax 


An interesting special case is the following: Suppose that (113) H(z) 
exists and is continuous for all x > 0; (II4) the functions 


i£(z) = H(z) (v = 1,-+-,k — 38) 


are bounded, i.e 








E(x) == a 
(II5) there is a positive constant c < 1 such that
aH (y) sas Ax 


for all x > 0 and (1 − c)x < y < (1 + c)x. Under these conditions we have


oxz k=3 (=1)'0"a'2, H” (x) 


(-n"e 2 ee yk — H*( va) lsc 
Tk = 9)! eas ~ ~ H' z+ Va (|8| <1), 





we have 


.s k-3 (7 __ ska? 
(8) u(z a v=) ae + i 1)" o E,Z" se Ape” 


Vim = vim? mik—-2) 


4 
and so, for .z| < 


cvim 
o 





™ 


NF OE NR, glia AIT SO IN TEE 4 a, 


wereneairim = 


Perr 
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Separate now the integral in (7) into two parts: 


I; =| a Is = [| _ 
|2|<c+/m/o j2|>e+/m/o 


Now 


i k-3 i ‘or z 
#@) + > A) 








\I2| < / dz. 
|z|>e+/m/o 


Evidently this last integral is exponentially small and so is A_k/m^{\frac12(k-2)}. By (8),


. > ee <2 (-1)'P- -™ 
h; [AZ = \V(e@o+d ds + 








=o =v m2 m?!2 = 2) 
ee (—1)’0"&z , (-—1)’P, @) 
7 [. (5 ors vi oplmr? 1G () + 7 ore mvr!2 2+ _— 


Combining these results we obtain 


k—3 k—3 
F(z) = . (3 ym Ve BV ore +) ye are ey) de + its 


p!m?!2 
3 on 


ca. E k-3 k- 2. A 
-> oo Int D > Sob ®,, ni + 


fs SX mior) 4(k—2) 


“i? a? _— 


where 
\[ I_{\alpha\beta} = \int_{-\infty}^{\infty} z^{\alpha}\,\Phi^{(\beta)}(z)\,dz. \]
Now the following facts can easily be established by means of partial integration:
(9) I_{αβ} = 0 when α − β is even,
(10) I_{αβ} = 0 when β − α > 1.
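
The vanishing rules (9) and (10) can be checked numerically. The sketch below assumes that I_{αβ} denotes the integral of z^α Φ^{(β)}(z) over the whole line, with Φ the standard normal distribution function, so that Φ^{(β)}(z) = (−1)^{β−1} He_{β−1}(z) Φ'(z), where He_n is the probabilists' Hermite polynomial. The function names are illustrative and are not part of the paper.

```python
def hermite_coeffs(n):
    """Coefficients (lowest degree first) of the probabilists' Hermite He_n."""
    prev, cur = [1.0], [0.0, 1.0]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0.0] + cur                 # z * He_k
        for i, c in enumerate(prev):
            nxt[i] -= k * c               # minus k * He_{k-1}
        prev, cur = cur, nxt
    return cur


def normal_moment(k):
    """k-th moment of the standard normal: 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return 0.0
    m = 1.0
    for j in range(1, k, 2):
        m *= j
    return m


def I(alpha, beta):
    """Integral of z^alpha * Phi^(beta)(z) over the real line, for beta >= 1."""
    coeffs = hermite_coeffs(beta - 1)
    sign = -1.0 if (beta - 1) % 2 else 1.0
    return sign * sum(c * normal_moment(alpha + i) for i, c in enumerate(coeffs))


# (9): alpha - beta even gives zero; (10): beta - alpha > 1 gives zero.
print(I(2, 2), I(0, 3), I(1, 4), I(3, 2))
```

Only the pairs with α − β odd and β − α ≤ 1, such as (α, β) = (3, 2), give a non-zero value.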
By (9), the non-vanishing terms in \sum_1 are the even terms and the non-vanishing
terms in \sum_2 are those for which μ + ν is even. Hence


[H=3)] gy 


pm ee 7 Cv Sav 
1 


— m ” 
[$(k—3)] [3(k—3)] Qu [3(k—4)] [$(k—4)] Qu+] 


) a xe a st Se Toy, 2u+2j+1 + Cin Sart 


=0 pal jel y=0 p=0 0 jm meter 





M 





I 27+], 2u+2j+2 « 
Using (10) to reduce \sum_2 further we get


z 


[3(k—3)] vp—1 2p [$(k—4)] v—1 2u+1 


/ 
ee tht Ce ken 





2 ot pal fat "err" —a apt wer 
{4(k—7)] » [4(k—6)] v / 
Juv Sov44 Juv foy43 
- vet + fae ae 
v=0 u=0 metets 2, d arr 
[4(k—9)] 1 a [4(k—6) ] 1 a A; 
= — 2», hutent 2 — pée+3 + 
a-o mts B=[($(a+1)] m amo «met? metre ” mi k-2) 
[4(k—3) ] 1 t—3 4 1 > 
= ae liz gajz4 + —. li; bj43 + 
i=3 m 7=[4 (1-2) ] #723 jan? m* j=(4G—-1)] #7 $27 a 
Hence 
/ [3(k—3) ] 
erge €2 & + Loo &3 l 
+ = + — pe -- 
2 » m m? 2 m” 
< = , Ak 
Cy bay > ly» Ean44 + Ly» &2043 J+ <a 
u=[4(v—2) ] u=[}(v—2) ] m r 
= & + oe = z= Dit; + s- 
Hence 


F(x) = & + ae e Div Ej; + “a 


M” jart mik-2) * 


Our final conclusion is: Under the conditions (II1)-(II5) formula (11) is true;
if k = 3, (11) remains true without the condition (II2).
ON THE DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT


By HERMAN RUBIN 


Cowles Commission for Research in Economics, University of Chicago 


The distribution of the serial correlation coefficient, in samples drawn from 
a parent distribution with zero serial correlation, has been studied by many 
authors. Anderson [1] obtained the exact distribution. Dixon [3] and Koopmans
[4] have given approximate distributions, each attained by smoothing the
characteristic values of the numerator of r̄ in (1) below. Dixon smoothed the
characteristic values in the generating function and obtained his results by 
comparing the moments of the exact distribution with those of the approxima- 
tion, of which the first 7 are found to be exact. Koopmans smoothed the 
characteristic values in the exact distribution function. Here we evaluate 
Koopmans’ result and show that it is the same as Dixon’s approximation. It 
thus appears that in this case it is immaterial whether the characteristic values 
are smoothed before or after inverting the characteristic function. We also 
add Tables comparing confidence limits for the exact distribution, for the ap- 
proximation referred to, and for a normal approximation. 

We define the serial correlation coefficient as

(1)
\[ \bar r = \frac{\sum_{t=1}^{T} x_t x_{t+1}}{\sum_{t=1}^{T} x_t^2}, \qquad x_{T+1} = x_1. \]

Then Koopmans obtains, if the true value ρ of r̄ equals 0, and the x_t are normally
and independently distributed with mean 0 and variance σ², the approximate
distribution

(2)
\[ h(\bar r, T) = \frac{2^{\frac12 T}\,(\frac12 T - 1)}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\, \sin\tfrac12 T\alpha\; \sin\alpha\; d\alpha. \]


Although in the distribution problem T is a positive integer, it is useful to
consider the right-hand member of (2) as the definition of h(r̄, T) for those
complex values of T for which it exists.

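Let R(T) denote the real part of T. If R(T) > 2N + 2, we obtain

(3)
\[ \frac{\partial^N}{\partial \bar r^N} h(\bar r, T) = (-1)^N\, \frac{2^{\frac12 T}}{\pi}\, (\tfrac12 T - 1)(\tfrac12 T - 2)\cdots(\tfrac12 T - N - 1) \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2 - N}\, \sin\tfrac12 T\alpha\; \sin\alpha\; d\alpha. \]

Now, according to [2], tables 41, 42,

(4)
\[ \int_0^{\pi/2} (\cos\alpha)^{\frac12 T - 2 - N}\, \sin\tfrac12 T\alpha\; \sin\alpha\; d\alpha = \frac{\frac12 T \pi}{2^{\frac12 T - N}} \cdot \frac{\Gamma(\frac12 T - N - 1)}{\Gamma(\frac12 (T - N + 1))\, \Gamma(\frac12 (1 - N))}. \]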
Denote by h^{(N)}(0, T) the value of ∂^N h(r̄, T)/∂r̄^N for r̄ = 0. Then for R(T) >
2N + 2,

(5)
\[ h^{(N)}(0, T) = \frac{(-1)^N\, 2^N\, \Gamma(\frac12 T + 1)}{\Gamma(\frac12 (T - N + 1))\, \Gamma(\frac12 (1 - N))}. \]

h(r̄, T) is analytic in r̄ for |r̄| < 1, R(T) > 2, and is analytic in T for |r̄| < 1,
R(T) > 2. It follows by Hartogs's theorem [5] that h(r̄, T) is analytic in r̄
and T for |r̄| < 1, R(T) > 2. By analytic continuation we get that (5) holds
for R(T) > 2. Consequently


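(6) If N is odd, h^{(N)}(0, T) = 0;
(7) if N is even,
\[ \frac{h^{(N)}(0, T)}{h(0, T)} = \frac{2^N\, \Gamma(\frac12 (T + 1))\, \Gamma(\frac12)}{\Gamma(\frac12 (T - N + 1))\, \Gamma(\frac12 (1 - N))}. \]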
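Let N = 2P; then

(8)
\[
\begin{aligned}
\frac{1}{(2P)!}\,\frac{h^{(2P)}(0, T)}{h(0, T)}
&= \frac{2^{2P}}{(2P)!} \cdot \frac{\Gamma(\frac12 (T + 1))}{\Gamma(\frac12 (T + 1) - P)} \cdot \frac{\Gamma(\frac12)}{\Gamma(\frac12 - P)} \\
&= \frac{(-1)^P}{P!} \cdot \frac{T - 1}{2} \cdot \frac{T - 3}{2} \cdots \frac{T - 2P + 1}{2} \\
&= \frac{1}{P!} \left[ \frac{d^P}{d(\bar r^2)^P}\, (1 - \bar r^2)^{\frac12 (T - 1)} \right]_{\bar r = 0}.
\end{aligned}
\]

According to (5),

(9)
\[ h(0, T) = \frac{\Gamma(\frac12 T + 1)}{\Gamma(\frac12 T + \frac12)\, \Gamma(\frac12)}. \]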
Hence

(10)
\[ h(\bar r, T) = \frac{\Gamma(\frac12 T + 1)}{\Gamma(\frac12 T + \frac12)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (T - 1)}, \]

which is the same as Dixon’s expression (3.22).
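
The agreement between Koopmans' integral (2) and the closed form (10) can be illustrated numerically. The following is a minimal sketch, not part of the paper: it evaluates (2) by Simpson's rule and compares it with (10). The function names and the choice of Simpson's rule are assumptions made for illustration, and the integration is only intended for T ≥ 4, where the integrand is bounded.

```python
import math


def koopmans_h(r, T, n=4000):
    """Evaluate the integral form (2) by Simpson's rule (n must be even)."""
    upper = math.acos(r)
    coeff = 2 ** (T / 2) * (T / 2 - 1) / math.pi
    power = T / 2 - 2

    def integrand(a):
        # Clamp tiny negative rounding errors near the upper limit.
        base = max(math.cos(a) - r, 0.0)
        return base ** power * math.sin(T * a / 2) * math.sin(a)

    step = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return coeff * total * step / 3


def closed_form_h(r, T):
    """Closed form (10)."""
    c = math.gamma(T / 2 + 1) / (math.gamma(T / 2 + 0.5) * math.gamma(0.5))
    return c * (1 - r * r) ** ((T - 1) / 2)


for T, r in [(4, -0.5), (6, 0.3), (8, 0.1)]:
    print(T, r, koopmans_h(r, T), closed_form_h(r, T))
```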

A more elementary proof by complete induction for integral values of T can
be based on the recurrent differential equation (14) which is of interest in itself.
To this end we shall write (2) in a different form which is easily obtained through
partial integration.

(11)
\[ h(\bar r, T) = \frac{2^{\frac12 T} \cdot \frac12 T}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1} \cos\tfrac12 T\alpha\; d\alpha. \]
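Differentiating with respect to r̄,

\[
\begin{aligned}
h'(\bar r, T)
&= -\frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2} \cos\tfrac12 T\alpha\; d\alpha \\
&= -\frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2} \bigl( \cos\tfrac12 (T - 2)\alpha\, \cos\alpha - \sin\tfrac12 (T - 2)\alpha\, \sin\alpha \bigr)\, d\alpha \\
&= -\frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1} \cos\tfrac12 (T - 2)\alpha\; d\alpha \\
&\quad - \bar r\, \frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2} \cos\tfrac12 (T - 2)\alpha\; d\alpha \qquad (12) \\
&\quad + \frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2} \sin\tfrac12 (T - 2)\alpha\, \sin\alpha\; d\alpha \\
&= -\bar r\, \frac{\frac12 T (\frac12 T - 1)\, 2^{\frac12 T}}{\pi} \int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2} \cos\tfrac12 (T - 2)\alpha\; d\alpha, \qquad (13)
\end{aligned}
\]

because the first and third terms in (12) cancel as may be shown by integrating
by parts.
Hence (13) reduces to the recurrent differential equation

(14)
\[ h'(\bar r, T) = -2 \cdot \tfrac12 T\, \bar r\, h(\bar r, T - 2). \]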
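
As a quick sanity check, the recurrence (14) can be compared with a numerical derivative of the closed form (10). This is only an illustrative sketch: the function names and the central-difference step are assumptions, not part of the paper.

```python
import math


def h_closed(r, T):
    """Closed form (10) for the approximate density of the serial correlation."""
    c = math.gamma(T / 2 + 1) / (math.gamma(T / 2 + 0.5) * math.gamma(0.5))
    return c * (1 - r * r) ** ((T - 1) / 2)


def h_prime_numeric(r, T, eps=1e-6):
    """Central-difference approximation to the derivative with respect to r."""
    return (h_closed(r + eps, T) - h_closed(r - eps, T)) / (2 * eps)


def h_prime_recurrence(r, T):
    """Right-hand member of (14): -2 * (T/2) * r * h(r, T - 2)."""
    return -2 * (T / 2) * r * h_closed(r, T - 2)


for T in (5, 6, 7, 12):
    print(T, h_prime_numeric(0.4, T), h_prime_recurrence(0.4, T))
```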
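Let us now assume that

(15)
\[ h(\bar r, T - 2) = \frac{\Gamma(\frac12 T)}{\Gamma(\frac12 T - \frac12)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (T - 3)}. \]

Then (14) becomes

(16)
\[
\begin{aligned}
h'(\bar r, T) &= -2 \bar r \cdot \tfrac12 T \cdot \frac{\Gamma(\frac12 T)}{\Gamma(\frac12 T - \frac12)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (T - 3)} \\
&= -2 \bar r \cdot \tfrac12 (T - 1) \cdot \frac{\Gamma(\frac12 T + 1)}{\Gamma(\frac12 T + \frac12)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (T - 1) - 1}.
\end{aligned}
\]

Integrating, one obtains

(17)
\[ h(\bar r, T) = \frac{\Gamma(\frac12 T + 1)}{\Gamma(\frac12 T + \frac12)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (T - 1)}. \]

No constant of integration occurs because (17) agrees with (5) for r̄ = 0 and
N = 0.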
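It remains to prove the validity of (17) for the initial values T = 3 and T = 4.
If T = 4,

(18)
\[
h(\bar r, 4) = \frac{4}{\pi} \int_0^{\arccos \bar r} \sin 2\alpha\, \sin\alpha\; d\alpha
= \frac{8}{3\pi} \Bigl[ \sin^3 \alpha \Bigr]_0^{\arccos \bar r}
= \frac{8}{3\pi}\, (1 - \bar r^2)^{\frac32}
= \frac{\Gamma(3)}{\Gamma(\frac52)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (4 - 1)}.
\]

For T = 3,

(19)
\[ h(\bar r, 3) = \frac{\frac12 \cdot 2^{\frac32}}{\pi} \int_0^{\arccos \bar r} \frac{\sin\tfrac32 \alpha\; \sin\alpha}{\sqrt{\cos\alpha - \bar r}}\; d\alpha. \]

Substitute cos α = r̄ + (1 − r̄) sin² θ. We get

\[
h(\bar r, 3) = \frac{2}{\pi}\, (1 - \bar r) \int_0^{\pi/2} \bigl\{ (1 + 2 \bar r) \cos^2 \theta + 2 (1 - \bar r) \sin^2 \theta \cos^2 \theta \bigr\}\, d\theta
= \tfrac34 (1 - \bar r^2)
= \frac{\Gamma(\frac52)}{\Gamma(2)\, \Gamma(\frac12)}\, (1 - \bar r^2)^{\frac12 (3 - 1)},
\]

which completes the proof.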

A short table of confidence limits is included, corresponding to the 5% and 
1% significance levels, comparing the exact distribution given by Anderson [1] 
(the values in parentheses being graphically interpolated by him), the distribu- 
tion (10), and the normal curve with the same mean and standard deviation. 


Confidence limits for r̄

       |           5%             |           1%
   T   |  Exact    (10)   Normal  |  Exact    (10)   Normal
   3   |  .854     .729    .736   |  .970     .882   1.040
   4   |  .713     .669    .672   |  .898     .833    .950
   5   |  .622     .621    .622   |  .823     .789    .879
   6   |  .570     .582    .582   |  .762     .750    .823
   7   |  .545     .549    .548   |  .714     .715    .775
   8   | (.521)    .521    .520   | (.682)    .685    .736
   9   |  .498     .497    .496   |  .656     .658    .701
  10   | (.477)    .476    .475   | (.633)    .634    .672
  11   |  .457     .458    .456   |  .612     .612    .645
  15   |  .400     .400    .399   |  .543     .543    .564
  20   | (.351)    .352    .351   | (.480)    .482    .496
  25   |  .317     .317    .317   |  .437     .437    .448
  30   | (.291)    .291    .291   | (.404)    .403    .411
  35   | (.271)    .271    .270   | (.377)    .376    .382
  40   | (.255)    .254    .254   | (.355)    .354    .359
  45   |  .240     .240    .240   |  .335     .335    .339
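
The "(10)" and "Normal" columns can be reproduced approximately with a short computation. The sketch below is an illustration under stated assumptions rather than the author's procedure: it treats each limit as the point whose upper-tail probability equals the stated level, integrates the density (10) by Simpson's rule, solves for the limit by bisection, and takes the variance of r̄ under (10) to be 1/(T + 2). All function names are hypothetical.

```python
import math
from statistics import NormalDist


def density(r, T):
    """Density (10); the base is clamped at 0 to guard against rounding."""
    c = math.gamma(T / 2 + 1) / (math.gamma(T / 2 + 0.5) * math.gamma(0.5))
    return c * max(1 - r * r, 0.0) ** ((T - 1) / 2)


def upper_tail(c, T, n=2000):
    """P(r > c) under (10), by Simpson's rule on [c, 1] (n must be even)."""
    step = (1.0 - c) / n
    total = density(c, T) + density(1.0, T)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(c + i * step, T)
    return total * step / 3


def limit_from_10(T, level):
    """Bisection for the point c with upper-tail probability equal to level."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, T) > level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def limit_normal(T, level):
    """Normal curve with mean 0 and the same variance 1/(T + 2)."""
    return NormalDist().inv_cdf(1 - level) / math.sqrt(T + 2)


for T in (3, 15, 25):
    print(T, round(limit_from_10(T, 0.05), 3), round(limit_normal(T, 0.05), 3))
```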
It is thus seen that the distribution (10) provides satisfactory significance levels
for T > 9 whereas the normal approximation provides satisfactory 5% significance
levels for the same range. The normal approximation appears to be
unsatisfactory, however, at the 1% significance level even for T as high as 45.
The normal approximation here used is not the same as that used by Anderson 

; VT —_ 
({1], p. 53), which assumes —_———— to be normally distributed. 
V1+ 2P 

The following table shows a comparison between a few more confidence limits
of the Type II curve (10) and the normal curve with same first two moments
for a few values of T.


Confidence limits for r̄

       |      5%       |      4%       |      3%       |      2%       |      1%
   T   | (10)   Normal | (10)   Normal | (10)   Normal | (10)   Normal | (10)   Normal
  15   | .400    .399  | .423    .425  | .452    .456  | .488    .498  | .543    .564
  20   | .352    .351  | .373    .373  | .398    .401  | .431    .438  | .482    .496
  25   | .317    .317  | .336    .337  | .360    .362  | .390    .395  | .437    .448
NOTES


This section is devoted to brief research and expository articles, notes on methodology 
and other short items. 
A NOTE CONCERNING HOTELLING’S METHOD OF INVERTING A
PARTITIONED MATRIX
By F. V. WAUGH
War Food Administration, Washington
Professor Hotelling recently presented several methods of computing the
inverse of a matrix.¹ Among these was a method of partitioning a square matrix
of 2p rows into four square matrices, a, b, c and d, of p rows each, resulting in
the partitioned matrix,

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]

The inverse of this matrix can also be written as a partitioned matrix,

\[ \begin{bmatrix} A & C \\ B & D \end{bmatrix}. \]

Then, multiplying the original matrix by its inverse we get four matrix equations,

\[
\begin{aligned}
aA + bB &= 1, & aC + bD &= 0, \\
cA + dB &= 0, & cC + dD &= 1.
\end{aligned}
\]

These equations can be solved for A, B, C, and D.

Professor Hotelling’s solution requires the inversion of four p-rowed matrices.
It is possible, however, to solve these equations by formulas involving only two
inversions. The formulas are

\[
\begin{aligned}
D &= (d - c a^{-1} b)^{-1}, & B &= -D c a^{-1}, \\
C &= -a^{-1} b D, & A &= a^{-1} - a^{-1} b B.
\end{aligned}
\]

As an example of the procedure let the given matrix be

\[
\begin{bmatrix}
26 & -10 & 15 & 32 \\
19 & 45 & -14 & -8 \\
-12 & 16 & 27 & 13 \\
32 & 29 & -35 & 28
\end{bmatrix}.
\]


¹ Harold Hotelling, "Some new methods of matrix calculation," Annals of Math.
Stat., Vol. 14 (1943), pp. 1-34.
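The necessary steps in computation are

\[
a^{-1} = \begin{bmatrix} .03309 & .00735 \\ -.01397 & .01912 \end{bmatrix}, \qquad
a^{-1} b = \begin{bmatrix} .39345 & 1.00008 \\ -.47723 & -.60000 \end{bmatrix},
\]

\[
c a^{-1} = \begin{bmatrix} -.62060 & .21772 \\ .65375 & .78968 \end{bmatrix}, \qquad
c a^{-1} b = \begin{bmatrix} -12.35708 & -21.60096 \\ -1.24927 & 14.60256 \end{bmatrix}.
\]

Note that a convenient check at this point is to compute both
(c a^{-1}) b and c (a^{-1} b).

\[
d - c a^{-1} b = \begin{bmatrix} 39.35708 & 34.60096 \\ -33.75073 & 13.39744 \end{bmatrix},
\]

\[
(d - c a^{-1} b)^{-1} = D = \begin{bmatrix} .00790 & -.02041 \\ .01991 & .02322 \end{bmatrix},
\]

\[
-a^{-1} b D = C = \begin{bmatrix} -.02302 & -.01519 \\ .01572 & .00419 \end{bmatrix},
\]

\[
-D c a^{-1} = B = \begin{bmatrix} .01825 & .01440 \\ -.00282 & -.02267 \end{bmatrix},
\]

\[
a^{-1} - a^{-1} b B = A = \begin{bmatrix} .02873 & .02436 \\ -.00696 & .01239 \end{bmatrix}.
\]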


The last four of these matrices are the four parts of the inverse, which can be
written

\[
\begin{bmatrix}
.02873 & .02436 & -.02302 & -.01519 \\
-.00696 & .01239 & .01572 & .00419 \\
.01825 & .01440 & .00790 & -.02041 \\
-.00282 & -.02267 & .01991 & .02322
\end{bmatrix}.
\]

The accuracy of the computations can be checked by multiplying the original
matrix by the computed inverse matrix. The product should, of course, be a
close approximation of the identity matrix. If further accuracy is called for
we can use Hotelling’s iterative formula,

\[ C_1 = C_0 (2 - A C_0) \]

where C_0 is the estimated inverse; A is the original matrix; and C_1 is a second
approximation of the inverse.
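
The two-inversion procedure and the iterative correction can be sketched in a few lines of code. What follows is a minimal illustration in plain Python, assuming matrices stored as lists of rows; the helper names and the Gauss-Jordan routine used for the two p-rowed inversions are choices made here and do not come from the note.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def matsub(x, y):
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def negate(x):
    return [[-v for v in row] for row in x]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (stand-in for any method)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partitioned_inverse(a, b, c, d):
    """Inverse of [[a, b], [c, d]] using only two p-rowed inversions."""
    a_inv = inverse(a)                           # first inversion
    a_inv_b = matmul(a_inv, b)
    c_a_inv = matmul(c, a_inv)
    D = inverse(matsub(d, matmul(c_a_inv, b)))   # second inversion
    B = negate(matmul(D, c_a_inv))
    C = negate(matmul(a_inv_b, D))
    A = matsub(a_inv, matmul(a_inv_b, B))
    return [ra + rc for ra, rc in zip(A, C)] + [rb + rd for rb, rd in zip(B, D)]


def hotelling_step(m, c0):
    """One step of C1 = C0 (2 - A C0), with 2 read as twice the identity."""
    n = len(m)
    twice_identity = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(c0, matsub(twice_identity, matmul(m, c0)))


a = [[26, -10], [19, 45]]
b = [[15, 32], [-14, -8]]
c = [[-12, 16], [32, 29]]
d = [[27, 13], [-35, 28]]
full = [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

inv = partitioned_inverse(a, b, c, d)
refined = hotelling_step(full, [[round(v, 5) for v in row] for row in inv])
for row in inv:
    print([round(v, 5) for v in row])
```

Here `refined` applies one correction step to the inverse rounded to five decimals, which mimics starting from the hand-computed values above.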
NEWS AND NOTICES


Readers are invited to submit to the Secretary of the Institute news items of interest 
Personal Items 


Professor W. G. Cochran of Iowa State College has gone overseas as a consultant
for the United States War Department.

Professor A. R. Crathorne of the University of Illinois has retired with the 
title of Professor Emeritus. 

Professor William Feller of Brown University has been appointed Professor 
of Mathematics at Cornell University, Ithaca, New York, as of July 1, 1945. 

Associate Professor Joe J. Livers has returned to Montana State College at 
Bozeman after receiving his doctorate in February at the University of Michigan. 

Assistant Professor W. A. Vezeau of the University of Detroit has been ap- 
pointed Assistant Professor of Mathematics at St. Louis University. 

Associate Professor S. S. Wilks of Princeton University has been promoted to
a professorship. 

The American Statistical Association elected ten Fellows during 1944. Of 
these ten, five are members of the Institute. They are A. E. Brandt, W. G. 
Cochran, Gertrude M. Cox, Alan Treloar, and Sewall Wright. The President 
of the Association is Dr. Walter A. Shewhart, a charter member of the Institute 
and its President during 1944. 




New Members 


The following persons have been elected to membership in the Institute: 

Allendoerfer, Asso. Prof. Carl B. Ph.D. (Princeton) Haverford College, Haverford, Pa. 

Beckstead, Lt. (j.g.) Gordon L. M.S. (Michigan) Aerologist, U. S. Navy. Aerology, Navy
#151, c/o Fleet Post Office, San Francisco, Calif. 

Berman, Abraham J. M.A. (Brooklyn) Statistician. 1460 College Avenue, New York, 
N. Y.

Bigelow, Julian H. Asso. Director, Statistical Research Group, Columbia University. 
401 West 118th St., New York 27, N. Y. 

Bowen, Earl K. A.M. (Boston) Instr. Math. Northeastern Univ., Boston, Mass. On 
military leave—Scientific Consultant, Office of Field Service, O.S.R.D. 6 Sibley Ave., 
W. Springfield, Mass. 

Canter, Stanley D. B.S. (Coll. City of N.Y.) Statistician, Lerner Shops, Inc., New York,
N.Y. 2676 Morris Ave., The Bronx 58, New York, N.Y.

Cohen, Karl. Ph.D. (Columbia) Physicist, Standard Oil Development Co. Esso Labora- 
tories, Research Division, P. O. Box 243, Elizabeth B, N. J. 

Cooper, William W. A.B. (Chicago) Instr. in Economics, University of Chicago. 6539 S.
Ellis Ave., Chicago 37, Ill. 

Davidson, James H. B.S. (Norwich Univ.) Research Physicist, Hercules Powder Co. 
Box 344, Christiansburg, Va. 

Epstein, Benjamin Ph.D. (Illinois) Staff Assistant, Westinghouse Electric & Mfg. Co., 
Quality Control Dept., Rm. 3-A-17, East Pittsburgh, Pa. 
Gauthier, Prof. Abel A.M. (Columbia) Prof. of Mathematics, Université de Montreal,
2900 Mount Royal Blvd., Montreal, Canada. 

Gersten, Lydia Blumenthal B.A. (Hunter) Res. Stat. 1001 Lincoln Place, Brooklyn 13, 
N. Y.

Goffman, Casper Ph.D. (Ohio State) Staff Asst., Quality Control Dept., Westinghouse 
Elec. & Mfg. Co., Rm. 3-A-17, East Pittsburgh, Pa. 

Hastay, Millard W. B.A. (Reed) Asso. Math., Stat. Res. Group, Columbia University. 
401 West 118th St., New York 27, N. Y. 

Houseman, Earl E. M.A. (South Dakota) Head Sampling Sec., Stat., Division of Program 
Surveys, Bur. of Agric. Econ., Washington 25, D. C. 

James, R. W. M.A. (Toronto) Asst. to Director, Washington Div., Wartime Prices & 
Trade Board. Room 3068, Railroad Retirement Bldg., Washington, D. C. 

Jones, Robert Richard, Jr. A.B. (Columbia) 61 Jackson St., New Rochelle, N.Y. 

Kac, Asst. Prof. Mark Ph.D. (John Casimir Univ., Lwow) Math. Dept., Whitehall, 
Cornell University, Ithaca, N. Y. 

Knoepfel, Margaret F. A.B. (Brooklyn) Jr. Stat., Weather Bureau, Washington, D. C. 
3306 Ely Place, S.E., Washington 19, D.C. 

Ladd, Robert Boyd M.A. (Texas Coll. of Arts & Industries) Stat. Consultant, OCT, 
Transport Economics, Traffic Control Div., War Dept., Washington, D.C. 903 Wade 
Ave., Rockville, Md. 

Larson, Charles M. B.Sc. (Nebraska) Stat. Analyst, Northrop Aircraft, Inc. 5144 West 
125th St., Hawthorne, Calif. 

Lesansky, William A. B.B.A. (City Coll.of N.Y.) Stat., War Dept., Washington, D. C. 
1841 Summit Place, N.W., Washington 9, D. C. 

Lewis, Wyatt H. B.S. (Calif. Inst. of Tech.) Quality Control Engineer. 212 East H 
Street, Ontario, Calif. 

Mathisen, Ensign Harold C. A.B. (Princeton) Ensign, USNR. 59 Fernwood Road, East 
Orange, N. J. 

Miller, Robert Carmi Res. Engineer, Elgin National Watch Co., Elgin, Ill. 

Mittra, Probodh Chandra B.Sc. (India) Grad. Student in Math. Stat., Columbia Uni- 
versity, New York 27, N. Y. 

Neumann, Prof. John von Ph.D. (Budapest) Institute for Advanced Study, Princeton, 
N. J. 

Noland, Asst. Prof. E. William Ph.D. (Cornell) Dept. of Sociology & Anthropology, 
McGraw Hall, Cornell University, Ithaca, N. Y. 

Okun, Yetta Edith B.A. (Hunter) Res. Asst., Dept. of Labor, Washington, D.C. 2120 
16th St., N.W., Washington 9, D.C. 

Owen, F.V. Ph.D. (Wisconsin) Geneticist, U.S. Dept. of Agric. 18/0 S. Main St., Salt 
Lake City, Utah. 

Poston, Paul Lehman B.S. (California) Statistician. George Washington Carver Hall,
211 Elm St., Washington D. C. 

Rice, William B. A.B. (Davidson) Director, Dept. of Stat. & Reports, Plomb Tool Co. 
906 Baldwin Ave., El Monte, Calif. 

Rudnicki, Alex. B.S. (City Coll. of N. Y.) Grad. Student in Math. Stat. 1072 Lorimer 
St., Brooklyn 22, N. Y. 

Rupp, William B. Mgr., Quality Control Dept., RCA Victor Div., Radio Corp. of America,
Harrison, N. J. 29 Dodd St., East Orange, N. J. 

Savage, Leonard J. Ph.D. (Michigan) Res. Math., Stat. Res. Group, Columbia Uni- 
versity, 401 West 118th St., New York 27, N. Y. 

Sheppard, David B.S. (Yale) Statistician, Army Air Forces. 2721 Terrace Road, S.E., 
Washington 20, D. C. 

Smith, Prof. James Gerald Ph.D. (Princeton) Prof. of Economics, Princeton University.
80 Murray Place, Princeton, N. J. 
Stigler, Prof. George J. Ph.D. (Chicago) Prof. of Economics, Member, Res. Staff, Na-
tional Bureau of Econ. Res., University of Minnesota, Minneapolis, Minn. 

Weingarten, Harry M.A. (Columbia) Math. Teacher, School of Aviation Trades. 1330 
Morris Ave., Bronx 56, N. Y. 

Weinstein, Joseph M.S. (C.C.N.Y.) Res. Analyst, Vacuum Tube Tests & Standardiza- 
tion, Camp Evans Signal Lab. Signal Corps. 13 Washington Village, Asbury Park, N.J. 

Westman, A. E.R. Ph.D. (Toronto) Dir. of Chem. Res., Ontario Research Foundation, 
43 Queen’s Park, Toronto 5, Canada. 

Wilcox, Sidney W. L.B. (California) Chief Stat., Bur. of Labor Stat. Room 2318, Dept. 
of Labor, Washington 25, D. C. 

Young, Captain Chen-Pang B.A. (National Tsing Hua Univ., China) Ordnance Dept., 
Chinese Army. 2311 Massachusetts Ave., N. W., Washington 8, D. C.


Corrections to the Directory Published in the December 1944 Issue 


The name of Dr. Walter Schilling was omitted from the Directory. It should 
have appeared as follows: 


Schilling, Walter M.D. (Harvard) Asst. Clinical Professor of Medicine
Stanford University Hospital, San Francisco 15, California. 




The name of Professor Godfrey H. Thomson, Director of the Training of 
Teachers, University of Edinburgh, Edinburgh, Scotland, was misspelled. 